US 6,549,916 B1

EVENT NOTIFICATION SYSTEM TIED TO A FILE SYSTEM

PRIORITY CLAIM AND CROSS-REFERENCE TO RELATED APPLICATIONS

This application is related to and claims domestic priority under 35 U.S.C. §119(e) from prior U.S. Provisional Patent Application Serial No. 60/147,538 filed on Aug. 5, 1999 entitled "Internet File System", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/251,757 filed on Feb. 18, 1999, now U.S. Pat. No. 6,427,123, entitled "Hierarchical Indexing for Accessing Hierarchically Organized Information in a Relational System", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/571,508, filed May 15, 2000, entitled "Multi-model Access to Data", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/571,496, filed May 15, 2000, entitled "File System that Supports Transactions", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/571,060, filed May 15, 2000, entitled "Stored Query Directories", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/571,492, filed May 15, 2000, entitled "Object File System with Typed Files", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/571,568, filed May 15, 2000, entitled "On-the-fly Format Conversion", by Eric Sedlar, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

This application is related to U.S. patent application Ser. No. 09/571,696, filed May 15, 2000, entitled "Versioning in Internet File System", by Eric Sedlar and Michael J. Roberts, the entire disclosure of which is hereby incorporated by reference as if fully set forth herein.

FIELD OF THE INVENTION

The present invention relates generally to electronic file systems, and in particular to a system which implements an operating system file system using a database system.

BACKGROUND OF THE INVENTION

Humans tend to organize information in categories. The categories in which information is organized are themselves typically organized relative to each other in some form of hierarchy. For example, an individual animal belongs to a species, the species belongs to a genus, the genus belongs to a family, the family belongs to an order, and the order belongs to a class.

With the advent of computer systems, techniques for storing electronic information have been developed that largely reflected this human desire for hierarchical organization. Conventional operating systems, for example, provide file systems that use hierarchy-based organization principles. Specifically, a typical operating system file system ("OS file system") has directories arranged in a hierarchy, and documents stored in the directories. Ideally, the hierarchical relationships between the directories reflect some relationship between the meanings that have been assigned to the directories. Similarly, it is ideal for each document to be stored in a directory based on some intuitive relationship between the contents of the document and the meaning assigned to the directory in which the document is stored.

FIG. 1 illustrates a typical mechanism by which a software application that creates and uses a file (such as a word processor) stores the file in a hierarchical file system. Referring to FIG. 1, an operating system 104 exposes to an application 102 an application programming interface (API). The API thus exposed allows the application 102 to call routines provided by the operating system. The portion of the OS API associated with routines that implement the OS file system is referred to herein as the OS file API. The application 102 calls file system routines through the OS file API to retrieve and store data on disk 108. The operating system 104, in turn, makes calls to a device driver 106 that controls access to the disk 108 to cause the files to be retrieved from and stored on disk 108.

The OS file system routines implement the hierarchical organization of the file system. For example, the OS file system routines maintain information about the hierarchical relationship between files, and provide application 102 access to the files based on their location within the hierarchy.

In contrast to hierarchical approaches to organizing electronic information, a relational database stores information in tables comprised of rows and columns. Each row is identified by a unique RowID. Each column represents an attribute of a record, and each row represents a particular record. Data is retrieved from the database by submitting queries to a database management system (DBMS) that manages the database.

FIG. 2 illustrates a typical mechanism by which a database application accesses information in a database. Referring to FIG. 2, database application 202 interacts with a database server 204 through an API provided by the database server 204 (a "database API"). The API thus exposed allows the database application 202 to access data using queries constructed in the database language supported by the database server 204. One such language that is supported by many database servers is the Structured Query Language (SQL). To the database application 202, database server 204 makes it appear that all data is stored in rows of tables. However, transparent to database application 202, the database server 204 actually interacts with the operating system 104 to store the data as files in the OS file system. The operating system 104, in turn, makes calls to device driver 106 to cause the files to be retrieved from and stored on disk 108.

Each type of storage system has advantages and limitations. A hierarchically organized storage system is simple, intuitive, and easy to implement, and is a standard model used by most application programs. Unfortunately, the simplicity of the hierarchical organization does not provide the support required for complex data retrieval operations. For example, the contents of every directory may have to be inspected to retrieve all documents created on a particular day that have a particular filename. Since all directories must

be searched, the hierarchical organization does nothing to facilitate the retrieval process.

A relational database system is well suited for storing large amounts of information and for accessing data in a very flexible manner. Relative to hierarchically organized systems, data that matches even complex search criteria may be easily and efficiently retrieved from a relational database system. However, the process of formulating and submitting queries to a database server is less intuitive than merely traversing a hierarchy of directories, and is beyond the technical comfort level of many computer users.

Currently, application developers are forced to choose whether they want data created by their applications to be accessible through the hierarchical file system provided by operating systems, or through the more complex query interface provided by database systems. In general, if applications do not demand the complex search capability of a database system, the applications are designed to store their data using the more prevalent and simpler hierarchical file system provided by operating systems. This simplifies both application design and application use, but also limits the flexibility and power with which the data can be accessed.

On the other hand, if complex search capability is required, the applications are designed to access their data using the query mechanism provided by database systems. While this increases the flexibility and power with which the data may be accessed, it also increases the complexity of the application, both from the perspective of the designer and the perspective of the user. It further requires the presence of a database system, which imposes an additional expense to the application user.

Based on the foregoing, it is clearly desirable to allow applications to access data using the relatively simple OS file APIs. It is further desirable to allow access to that same data using the more powerful database API.

SUMMARY OF THE INVENTION

Techniques for managing files in a computer system are provided. According to one technique, an association is established between a type of file system operation, a file, and an interested entity. It is detected when that type of file system operation is performed on the file. In response to detecting that that type of file system operation is performed on the file, a message is sent to the interested entity.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:

FIG. 1 is a block diagram that illustrates how conventional applications store data through the file system provided by an operating system;

FIG. 2 is a block diagram that illustrates how conventional database applications store data through the database API provided by a database system;

FIG. 3 is a block diagram that illustrates a system in which the same set of data may be accessed through a variety of interfaces, including a database API and an OS file system API;

FIG. 4 is a block diagram that illustrates translation engine 308 in greater detail;

FIG. 5 is a block diagram that illustrates a hierarchical index;

FIG. 6 is a block diagram of a file hierarchy that can be emulated by a hierarchical index;

FIG. 7 is a block diagram of a files table that can be used to store files within a relational database according to an embodiment of the invention;

FIG. 8 is a flowchart illustrating the steps for resolving a pathname using a hierarchical index;

FIG. 9 is a block diagram that illustrates a database file server in greater detail;

FIG. 10 is a block diagram of a hierarchical index that includes an entry for a stored query directory;

FIG. 11 is a block diagram of a files table that includes a row for a stored query directory;

FIG. 12 is a block diagram that illustrates a file hierarchy that includes a stored query directory;

FIG. 13 is a block diagram that illustrates a file hierarchy;

FIG. 14 is a block diagram that illustrates how the file hierarchy of FIG. 13 is updated in response to an update to a document according to one embodiment of the versioning techniques described herein;

FIG. 15 is a block diagram that illustrates how the file hierarchy of FIG. 13 is updated in response to the movement of a document from one folder to another according to one embodiment of the versioning techniques described herein;

FIG. 16 is a block diagram illustrating a class hierarchy of file classes according to an embodiment of the invention;

FIG. 17 is a block diagram of relational tables that are used in a database-implemented file system that implements the file class hierarchy of FIG. 16, according to one embodiment of the invention; and

FIG. 18 is a block diagram that illustrates a computer system on which embodiments of the invention may be implemented.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

A method and system are provided that allow the same set of data to be accessed through a variety of interfaces, including a database API and an OS file system API. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

Architectural Overview

FIG. 3 is a block diagram that illustrates the architecture of a system 300 implemented according to an embodiment of the invention. Similar to the system illustrated in FIG. 2, system 300 includes a database server 204 that provides a database API through which a database application 312 can access data managed by database server 204. From the perspective of all entities that access data managed by database server 204 through the database API, the data managed by database server 204 is stored in relational tables that can be queried using the database language supported by database server 204 (e.g. SQL). Transparent to those entities, database server 204 stores the data to disk 108. According to one embodiment, database server 204 implements disk management logic that allows it to store the data directly to disk and thus avoid the overhead associated with the OS file system of operating system 104. Thus, database server 204 may cause the data to be stored to disk either by
(1) making calls to the OS file system provided by operating system 104, or (2) storing the data directly to disk, thus circumventing operating system 104.

Unlike the system of FIG. 2, system 300 provides a translation engine 308 that translates I/O commands received from operating systems 304a and 304b into database commands that the translation engine 308 issues to database server 204. When the I/O commands call for the storage of data, translation engine 308 issues database commands to database server 204 to cause the data to be stored in relational tables managed by database server 204. When the I/O commands call for the retrieval of data, translation engine 308 issues database commands to database server 204 to retrieve data from relational tables managed by database server 204. Translation engine 308 then provides the data thus retrieved to the operating system that issued the I/O commands.

To operating systems 304a and 304b, the fact that data passed to translation engine 308 is ultimately stored in a relational database managed by database server 204 is transparent. Because it is transparent to operating systems 304a and 304b, it is also transparent to applications 302a and 302b that are running on platforms that include those operating systems.

For example, assume that the user of application 302a selects a "save file" option provided by the application 302a. The application 302a makes a call through the OS File API to cause operating system 304a to save the file. The operating system 304a issues an I/O command to translation engine 308 to store the file. Translation engine 308 responds by issuing one or more database commands to database server 204 to cause the database server 204 to store the data contained in the file into relational tables maintained by the database server 204. Database server 204 may either store the data directly to disk or make calls to the operating system 104 to cause the data to be stored in the OS file system provided by operating system 104. If database server 204 calls operating system 104, operating system 104 responds by causing the data to be stored on disk 108 by sending commands to device driver 106.

As another example, assume that the user of application 302a selects a "load file" option provided by the application 302a. The application 302a makes a call through the OS File API to cause operating system 304a to load a file. The operating system 304a issues an I/O command to translation engine 308 to load the file. Translation engine 308 responds by issuing one or more database commands to database server 204 to cause the database server 204 to retrieve from relational tables maintained by the database server 204 the data that comprises the file to be retrieved. During the retrieval of the data, database server 204 may either retrieve the data directly or make calls to the operating system 104 to cause the data to be retrieved from OS files on disk 108. Once the data is retrieved, the desired file is "constructed" from the retrieved data. Specifically, the retrieved data is placed in a format expected by the application 302a that requested the file. The file thus constructed is passed through the translation engine 308 and operating system 304a up to application 302a.

System 300 incorporates numerous novel features. In the following sections, these features shall be described in greater detail. It should be understood, however, that the specific embodiments are used to describe the features, and that the invention is not limited to those specific embodiments.

OS File System Access to Relationally Stored Data

According to one aspect of the invention, system 300 allows applications to access data stored in a database through the conventional OS file APIs. That means that conventional applications that have been designed to load files by making calls to the standard OS file API provided by operating systems are able to load files that are constructed on-the-fly from data stored in relational tables. Further, the fact that the data originates from relational tables is entirely transparent to the applications.

For example, assume that database application 312 issues a database command to insert a row of data into a table in the database maintained by database server 204. Once the row has been inserted, application 302a, which is only designed to access data using the relatively simple OS file API provided by operating system 304a, issues a "file open" command to operating system 304a. In response, operating system 304a issues an I/O command to translation engine 308, which responds by issuing one or more database commands to database server 204. Database server 204 executes the database command (typically in the form of a database query), thereby causing database server 204 to retrieve the row inserted by database application 312. A file of the file type expected by application 302a is constructed from the data contained in the row, and the file thus constructed is passed back up to application 302a through translation engine 308 and operating system 304a.

System 300 not only allows relationally stored data to be accessed by applications that only support conventional OS file system access, but system 300 also allows information stored by applications that only support conventional OS file system access to be accessed by database applications using conventional querying techniques. For example, assume that application 302a makes an OS call to save a file that it has created. The file save command is passed down through operating system 304a and translation engine 308 to database server 204. Database server 204 receives the "file save" command in the form of a database command, issued by translation engine 308, to store the data contained in that file into one or more rows of one or more tables contained in the database managed by database server 204. Once the data is stored within the database in that manner, database application 312 may issue database queries to database server 204 to retrieve the data from the database.

Emulating OS File System Organization in a Database

As explained above, calls made to the file system routines of operating systems 304a and 304b are ultimately translated to database commands issued by translation engine 308 to database server 204. According to one embodiment of the invention, the process of performing these translations is simplified by emulating within database server 204 the characteristics of the file systems implemented by operating systems 304a and 304b.

With respect to the organizational model, most operating systems implement file systems that organize files in a file hierarchy. Thus, the OS file system calls made by applications 302a and 302b will typically identify a file in terms of its location within the OS file hierarchy. To simplify the translation of such calls to corresponding database calls, a mechanism is provided for emulating a hierarchical file system within a relational database system. One such mechanism is described in detail in U.S. patent application Ser. No. 09/251,757, entitled "HIERARCHICAL INDEXING FOR ACCESSING HIERARCHICALLY ORGANIZED INFORMATION IN A RELATIONAL SYSTEM", filed by Eric Sedlar on Feb. 18, 1999, the entire contents of which are incorporated herein by reference.
Specifically, the "HIERARCHICAL INDEXING" application
describes techniques for creating, maintaining, and
using a hierarchical index to efficiently access information in
a relational system based on pathnames, thus emulating a
hierarchically organized system. Each item that has any
children in the emulated hierarchical system has an index
entry in the index. The index entries in the index are linked
together in a way that reflects the hierarchical relationship
between the items associated with the index entries.
Specifically, if a parent-child relationship exists between the
items associated with two index entries, then the index entry
associated with the parent item has a direct link to the index
entry associated with the child item.

Consequently, pathname resolution is performed by following
direct links between the index entries associated with
the items in a pathname, according to the sequence of the
filenames within the pathname. By using an index whose
index entries are linked in this manner, the process of
accessing the items based on their pathnames is significantly
accelerated, and the number of disk accesses performed
during that process is significantly reduced.

Hierarchical Index 

Hierarchical indexes consistent with the invention support 
the pathname-based access method of a hierarchical system, 
moving from parent items to their children, as specified by 
the pathname. According to one embodiment, a hierarchical 
index consistent with the principles of the invention employs 
index entries that include the following three fields: RowID, 
FileID, and Dir_entry_list (stored as an array).
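
As a minimal sketch of the index entry layout just listed, the three fields could be modeled in memory as follows. The class names `IndexEntry` and `ArrayEntry`, the Python types, and the FileID "X5" for the Access directory are assumptions made for illustration; the per-child fields follow the description of the Dir_entry_list field given herein, and the remaining IDs are the example values used in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArrayEntry:
    """One element of a Dir_entry_list: describes a single child."""
    filename: str                        # name of the child within its parent
    file_id: str                         # FileID of the child
    child_row_id: Optional[str] = None   # RowID of the child's own index entry, if any


@dataclass
class IndexEntry:
    """One index entry of the hierarchical index (hypothetical in-memory model)."""
    row_id: str                          # RowID: locates this index entry
    file_id: str                         # FileID of the item this entry belongs to
    dir_entry_list: List[ArrayEntry] = field(default_factory=list)


# Index entry 512 for the Windows directory (RowID "Y2", FileID "X2").
# Word has its own index entry (RowID "Y3"); Access has no children,
# so it has no index entry. "X5" is a made-up placeholder FileID.
windows = IndexEntry(
    row_id="Y2",
    file_id="X2",
    dir_entry_list=[
        ArrayEntry("Word", "X3", child_row_id="Y3"),
        ArrayEntry("Access", "X5"),
    ],
)
```

Keeping the child's RowID inside the parent's array entry is what provides the direct parent-to-child link between index entries.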

FIG. 5 shows a hierarchical index 510 which may be used
to emulate a hierarchical storage system in a database. FIG.
6 shows the specific file hierarchy that hierarchical index
510 is emulating. FIG. 7 shows a files table 710, used to
store the files illustrated in FIG. 6 within a relational
database.

Hierarchical index 510 is a table. The RowID column
contains system-generated IDs, specifying a disk address that
enables database server 204 to locate the row on the disk.
Depending on the relational database system, RowID may
be an implicitly defined field that the DBMS uses for
locating data stored on the disk drive. The FileID field of an
index entry stores the FileID of the file that is associated
with the index entry.

According to one embodiment of the invention, hierarchical index 510 only stores index entries for items that have children. In the context of an emulated hierarchical file system, therefore, the items that have index entries in the hierarchical index 510 are only those directories that are parents to other directories and/or that are currently storing documents. Those items that do not have children (e.g. Example.doc, Access, App1, App2, App3 of FIG. 6) are preferably not included. The Dir_entry_list field of the index entry for a given file stores, in an array, an "array entry" for each of the child files of the given file.

For example, index entry 512 is for the Windows directory 614. The Word directory 616 and the Access directory 620 are children of the Windows directory 614. Hence, the Dir_entry_list field of index entry 512 for the Windows directory 614 includes an array entry for the Word directory 616 and an array entry for the Access directory 620.

According to one embodiment, the specific information that the Dir_entry_list field stores for each child includes the filename of the child and the FileID of the child. For children that have their own entries in the hierarchical index 510, the Dir_entry_list field also stores the RowID of the child's index entry. For example, the Word directory 616 has its own entry in hierarchical index 510 (entry 514). Hence, the Dir_entry_list field of index entry 512 includes the name of directory 616 ("Word"), the RowID of the index entry for directory 616 in hierarchical index 510 ("Y3"), and the FileID of directory 616 ("X3"). As shall be described in greater detail, the information contained in the Dir_entry_list field makes accessing information based on pathnames much faster and easier.

Several key principles of the hierarchical index are as follows:

The Dir_entry_list information in the index entry for a given directory is kept together in as few disk blocks as possible, since the most frequently used filesystem operations (pathname resolution, directory listing) will need to look at many of the entries in a particular directory whenever that directory is referenced. In other words, directory entries should have a high locality of reference because when a particular directory entry is referenced, it is likely that other entries in the same directory will also be referenced.

The information stored in the index entries of the hierarchical index must be kept to a minimum, so as to fit the maximum number of entries in a particular disk block. Grouping directory entries together in an array means that there is no need to replicate a key identifying the directory they are in; all of the entries in a directory share the same key.

The time needed to resolve a pathname should be proportional to the number of directories in the path, not the total number of files in the filesystem. This allows the user to keep frequently-accessed files toward the top of the filesystem tree, where access time is lower.

These elements are all present in typical file system directory structures, such as the UNIX system of inodes and directories. The use of a hierarchical index, as described herein, reconciles those goals with the structures that a relational database understands and can query, to allow the database server to do ad-hoc searches of files in a manner other than that used in pathname resolution. To do this, the database concept of an index must be used: a duplicate of parts of the underlying information (in this case, the file data) arranged in a separate data structure in a different manner designed to optimize access via a particular method (in this case, resolution of a pathname in a hierarchical tree).

Using the Hierarchical Index

How hierarchical index 510 may be used to access a file based on the pathname of the file shall now be described with reference to the flowchart in FIG. 8. It shall be assumed for the purpose of explanation that document 618 is to be accessed through its pathname. The pathname for this file is /Windows/Word/Example.doc, which shall be referred to hereafter as the "input pathname". Given this pathname, the pathname resolution process starts by locating within hierarchical index 510 the index entry for the first name in the input pathname. In the case of a file system, the first name in a pathname is the root directory. Therefore, the pathname resolution process for locating a file within an emulated file system begins by locating the index entry 508 of the root directory 610 (step 800). Because all pathname resolution operations begin by accessing the root directory's index entry 508, data that indicates the location of the index entry for the root directory 610 (index entry 508) may be maintained at a convenient location outside of the hierarchical index 510 in order to quickly locate the index entry 508 of the root directory at the start of every search.
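
The lookup loop of FIG. 8 might be sketched as follows, with the location of the root entry held outside the index as suggested above. The dictionary layout, the names `ROOT_ROW_ID`, `resolve`, and `PathError`, the RowID "Y1" and FileID "X1" for the root, and the FileID "X5" for the Access directory are assumptions for illustration; the remaining IDs are the example values used in the text. The sketch also assumes that step 824 follows the child's RowID to the child's index entry.

```python
# Hierarchical index keyed by RowID. Each value is (FileID, Dir_entry_list),
# where Dir_entry_list maps filename -> (FileID, RowID of the child's own
# index entry, or None if the child has no index entry).
INDEX = {
    "Y1": ("X1", {"Windows": ("X2", "Y2")}),                      # root directory
    "Y2": ("X2", {"Word": ("X3", "Y3"), "Access": ("X5", None)}),
    "Y3": ("X3", {"Example.doc": ("X4", None)}),
}
FILES = {"X4": b"file body"}   # files table: FileID -> body (other rows omitted)

ROOT_ROW_ID = "Y1"             # kept outside the index so step 800 is one lookup


class PathError(Exception):
    """Raised when the input pathname is not valid (step 810)."""


def resolve(pathname):
    """Return the FileID for an absolute pathname such as /Windows/Word/Example.doc."""
    names = [n for n in pathname.split("/") if n]
    file_id, entries = INDEX[ROOT_ROW_ID]          # step 800
    for i, name in enumerate(names):               # steps 802 and 804
        if name not in entries:                    # steps 806 and 808
            raise PathError(pathname)              # step 810
        file_id, child_row = entries[name]
        is_last = i == len(names) - 1              # step 822
        if not is_last:
            if child_row is None:                  # child has no index entry
                raise PathError(pathname)
            _, entries = INDEX[child_row]          # step 824 (assumed)
    return file_id                                 # FileID used at step 820


body = FILES[resolve("/Windows/Word/Example.doc")]
```

Each loop iteration touches exactly one index entry, so the number of index lookups grows with the depth of the path rather than with the total number of files.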
Once the index entry 508 for the root directory 610 has been located, the DBMS determines whether there are any more filenames in the input pathname (step 802). If there are no more filenames in the input pathname, then control proceeds to step 820 and the FileID stored in index entry 508 is used to look up the root directory entry in the files table 710.

In the present example, the filename "Windows" follows the root directory symbol "/" in the input pathname. Therefore, control proceeds to step 804. At step 804, the next filename (e.g. "Windows") is selected from the input pathname. At step 806, the DBMS looks in the Dir_entry_list column of the index entry 508 to locate an array entry pertaining to the selected filename.

In the present example, the filename that follows the root directory in the input pathname is "Windows". Therefore, step 806 involves searching the Dir_entry_list of index entry 508 for an array entry for the filename "Windows". If the Dir_entry_list does not contain an array entry for the selected filename, then control would proceed from step 808 to step 810, where an error is generated to indicate that the input pathname is not valid. In the present example, the Dir_entry_list of index entry 508 does include an array entry for "Windows". Therefore, control passes from step 808 to step 822.

The information in the Dir_entry_list of index entry 508 indicates that one of the children of the root directory 610 is indeed a file named "Windows". Further, the Dir_entry_list array entry contains the following information about this child: it has an index entry located at RowID Y2, and its FileID is X2.

At step 822, it is determined whether there are any more filenames in the input pathname. If there are no more filenames, then control passes from step 822 to step 820. In the present example, "Windows" is not the last filename, so control passes instead to step 824.

Because "Windows" is not the last filename in the input path, the FileID information contained in the Dir_entry_list is not used during this path resolution operation. Rather, because Windows directory 614 is just part of the specified path and not the target, files table 710 is not consulted at this

.doc" is selected from the input pathname. At step 806, the Dir_entry_list of index entry 514 is searched to find (step 808) that there is an array entry for "Example.doc", indicating that "Example.doc" is a child of Word directory 616. The system also finds that Example.doc has no indexing information in hierarchical index 510, and that specific information pertaining to Example.doc can be found in files table 710 using the FileID X4. Since Example.doc is the target file to be accessed (i.e. the last filename in the input path), control passes to step 820 where the system uses the FileID X4 to access the appropriate row in the files table 710, and to extract the file body (the BLOB) stored in the body column of that row. Thus, the Example.doc file is accessed.

In accessing this file, only hierarchical index 510 was used. No table scans were necessary. With typical sizes of blocks and typical filename lengths, at least 600 directory entries will fit in a disk block, and a typical directory has fewer than 600 entries. This means that the list of directory entries in a given directory will typically fit in a single block. In other words, each index entry of hierarchical index 510, including the entire Dir_entry_list array of the index entry, will typically fit in a single block, and therefore can be read in a single I/O operation.

In moving from index entry to index entry in the hierarchical index 510, it is possible that some disk accesses will need to be performed if the various index entries in the index reside in different disk blocks. If each index entry fits entirely in a single block, however, then the number of disk accesses will at most be the number of directories in the path. Even if an average index entry does not fit in a single disk block, the number of disk accesses per directory will be a constant term, and will not increase with the total number of files in the file system.

The foregoing description of techniques for emulating the hierarchical characteristic possessed by some file systems is merely exemplary. Other techniques may be used to emulate the hierarchical characteristics of some file systems and protocols. Further, some protocols may not even possess a hierarchical characteristic. Thus, the present invention is not limited to any particular technique for emulating the hier-

point. Instead, at step 824 the RowID (Y2) for "Windows", archical characteristic of some protocols. Further, the 

which is found in the Dir_entry_list of index entry 508, is present invention is not limited to protocols that are hierar- 

used to locate the index entry for the Windows directory 614 45 chical in nature. 

(index entry 512). , . rt , rt „ „, „ ^, ... 

- T> . A 4 - . , 4 f11 jL Emulating Other OS File System Characteristics in 

Consulting the Dir_entry_Jist of index entry 512, the b Database 

system searches for the next filename in the input pathname 

(steps 804 and 806). In the present example, the filename Beyond the hierarchical organization of OS file systems, 

"Word" follows the filename "Windows" in the input path- 50 another characteristic of most OS file systems is that they 

name. Therefore, the system searches the Dir_entry__list of maintain certain system information about the files that they 

index entry 512 for an array entry for "Word". Such an entry store. According to one embodiment, this OS file system 

exists in the Dir_entry_list of index entry 512, indicating characteristic is also emulated within the database system, 

that "Windows" actually does have a child named "Word" Specifically, translation engine 308 issues commands that 

(step 808). At step 822, it is determined that there are more 55 cause the "system" data for a file to be stored in a row of a 

filenames in the input path, so control proceeds to step 824. files table (e.g. files table 710) managed by database server 

Upon finding the array entry for "Word", the system reads 204. According to one embodiment, all or most of the file 

the information in the array entry to determine that an index contents is stored as a large binary object (BLOB) in one 

entry for the Word directory 616 can be found in hierarchical column of the row. In addition to the BLOB column, the files 

index 510 at RowID Y3, and that specific information eo table further includes columns for storing attribute values 

pertaining to Word directory 616 can be found in files table that correspond to those implemented in OS file systems. 

710 at row X3. Since Word directory 616 is just part of the Such attribute values include, for example, the owner or 

specified path and not the target, files table 710 is not creator of the file, the creation date of the file, the last 

consulted. Instead, the system uses the RowID (Y3) to locate modification data of the file, the hard links to the file, the file 

the index entry 514 for Word directory 616 (step 824). 65 name, the size of the file, and the file type. 

At RowID Y3 of hierarchical index 510, the system finds When translation engine 308 issues database commands 

index entry 514. At step 804, the next filename "Example- to database server 204 to perform any file operation, those 
database commands include statements which cause the attributes
associated with the files involved in the operation to be modified
appropriately. For example, in response to inserting a new row in
the files table for a newly created file, translation engine 308
issues database commands to (1) store in the "owner" column of the
row a value that indicates the user who is creating the file, (2)
store in the "creation date" column of the row a value that
indicates the current date, (3) store in the "last modify" column
a value that indicates the current date and time, and (4) store in
the "size" column a value that indicates the size of the BLOB. In
response to subsequent operations on the file, the values in these
columns are modified as required by the operations. For example,
if translation engine 308 issues a database command that modifies
the contents of a file stored in a particular row, then as part of
the same operation the translation engine 308 issues a database
command to update the "last modify" value of the particular row.
Further, if the modification changes the size of the file, then
translation engine 308 also issues a database command to update
the "size" value of the particular row.

Another characteristic of most OS file systems is the ability to
provide security on a file-by-file basis. For example, Windows NT,
VMS and some versions of UNIX maintain access control lists that
indicate the rights that various entities have with respect to
each file. According to one embodiment of the invention, this OS
file system characteristic is emulated within the database system
by maintaining a "security table" where each row of the security
table contains content similar to an entry of an access control
list. For example, a row in the security table contains one column
to store a value that identifies a file, another column to store a
value that represents a permission type (e.g. read, update,
insert, execute, change permission), another column that stores a
flag to indicate whether the permission is granted or denied, and
an owner column to store a value that represents the owner of that
permission for that file. The owner may be a single user,
identified by a userid, or a group, identified by a groupid. In
the case of a group, one or more additional tables are used to map
the groupid to the userids of the members of the group.

Prior to issuing database commands that access a file stored in
the files table managed by database server 204, translation engine
308 issues database commands to verify that the user that is
requesting the access has permission to perform the type of access
requested for the specified file. Such pre-access database
commands would retrieve data from the security table to determine
whether the user that is requesting access has permission to
perform the access. If the data thus retrieved indicates that the
user does not have the required permission, then translation
engine 308 does not issue the commands that perform the requested
operation. Instead, translation engine 308 provides an error
message back to the operating system from which the request
originated. In response to the error message, the operating system
sends the same OS error message to the application that requested
the access as the operating system would send if the application
had attempted to access, without permission, a file maintained in
the OS file system of that operating system. Thus, even under
error conditions, the fact that the data is stored in a relational
database rather than in the OS file system is transparent to the
application.

Different operating systems store different types of system
information about files. For example, one operating system may
store an "archive" flag but no icon information, while another may
store icon information but no archive flag. The specific set of
system data maintained by a database system that implements the
techniques described herein may vary from implementation to
implementation. For example, database server 204 may store all of
the system data supported by the OS file system of operating
system 304a, but only some of the system data supported by the OS
file system of operating system 304b. Alternatively, database
server may store all of the system data supported by both
operating systems 304a and 304b, or less than all of the system
data supported by any one of the operating systems 304a and 304b.

As illustrated in FIG. 3, database server 204 stores files that
originate from numerous distinct OS file systems. For example,
operating system 304a may be different from operating system 304b,
and both operating systems 304a and 304b may be different from
operating system 104. OS file systems 304a and 304b may have
contradictory characteristics. For example, OS file system 304a
may allow filenames to contain the character "/", while OS file
system 304b may not. According to one embodiment, in situations
such as this, translation engine 308 is configured to implement OS
file system-specific rules. Thus, if application 302a attempts to
store a file whose filename contains the character "/",
translation engine 308 issues database commands to database server
204 to perform the operation. On the other hand, if application
302b attempts to store a file whose filename contains the
character "/", then translation engine 308 raises an error.

Alternatively, translation engine 308 may be configured to
implement a single set of rules for all operating systems. For
example, translation engine 308 may implement the rule that if a
filename is not valid in even one operating system supported by
translation engine 308, then an error will be raised even if the
filename is valid in the operating system that issued the command
that specified the filename.

Translating OS File System Calls to Database Queries

Having built mechanisms to emulate OS file system characteristics
within a database system, the translation of OS file system calls
to database queries may be performed by translation engine 308
without losing the functionality expected by the applications that
are making the OS file system calls. The OS file system calls made
by those applications are made through the OS file API provided by
the operating systems in which they are executing. For example,
for programs written in the "C" programming language, a source
code file entitled "stdio.h" is used to specify the interface of
the OS file API of an operating system. The stdio.h file is
included by applications so that the applications will know how to
invoke the routines that implement the OS file API.

The specific routines that implement an OS file API may vary from
operating system to operating system, but typically include
routines to perform the following operations: open file, read from
file, write to file, seek within a file, lock file, and close
file. The mapping from these I/O commands to relational database
commands is:

open file       = begin transaction, resolve pathname to locate
                  row that contains file
write to file   = update
read from file  = select
lock file       = lock row associated with file
seek in file    = update counter
close file      = commit transaction (the Windows OS file system
                  protocol requires that the directory entry be
                  committed immediately before the file data is
                  written. Other protocols do not.)



As will be discussed in greater detail hereafter, some file
systems expect the name of a file to be visible even before
the contents of the file have been received. In the context of
those file systems, the "open file" I/O command corresponds
to a begin transaction for writing the name and a commit
transaction for writing the name, as well as a begin
transaction for writing the content.

According to one embodiment, a counter is used to track
the "current location" within a file. In embodiments where 
the files are stored as BLOBs, the counter may take the form 
of an offset from the beginning of a BLOB. Upon the 
execution of an "open file" command, a counter is created 
and set to a value that indicates the starting address of the 
BLOB in question. The counter for a BLOB is then incre- 
mented in response to data being read from or written to the 
BLOB. Seek operations cause the counter to be updated to 
point to the location within the BLOB dictated by the seek 
operation's parameters. According to one embodiment, 
these operations are facilitated through the use of LOB 
Locators, as described in U.S. patent application Ser. No. 
08/962,487 entitled "LOB LOCATORS", filed Oct. 31, 1997 
by Nori et al., the entire contents of which are incorporated
herein by reference. 
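
The counter-based mapping described above can be sketched as follows. This is a minimal illustration only, assuming SQLite in place of database server 204 and a hypothetical `files` table with `name` and `body` columns; the class name `DbFileHandle` and its methods are invented for this sketch, it does not use LOB locators, and it expects a connection opened with `isolation_level=None` so that transactions are controlled explicitly.

```python
import sqlite3


class DbFileHandle:
    """Hypothetical file handle whose 'current location' is a BLOB offset."""

    def __init__(self, conn, name):
        # "open file": begin a transaction and locate the row holding the file.
        self._conn = conn
        self._name = name
        self._offset = 0  # counter starts at the beginning of the BLOB
        conn.execute("BEGIN")
        found = conn.execute(
            "SELECT 1 FROM files WHERE name = ?", (name,)).fetchall()
        if not found:
            conn.execute("ROLLBACK")
            raise FileNotFoundError(name)

    def seek(self, offset):
        # "seek in file": only the counter changes.
        self._offset = offset

    def read(self, length):
        # "read from file": a SELECT of the requested byte range
        # (SQLite's substr is 1-indexed and counts bytes for BLOBs).
        rows = self._conn.execute(
            "SELECT substr(body, ?, ?) FROM files WHERE name = ?",
            (self._offset + 1, length, self._name)).fetchall()
        chunk = bytes(rows[0][0])
        self._offset += len(chunk)
        return chunk

    def write(self, data):
        # "write to file": an UPDATE that splices data in at the counter.
        rows = self._conn.execute(
            "SELECT body FROM files WHERE name = ?", (self._name,)).fetchall()
        body = bytes(rows[0][0]).ljust(self._offset, b"\0")
        end = self._offset + len(data)
        self._conn.execute(
            "UPDATE files SET body = ? WHERE name = ?",
            (body[:self._offset] + data + body[end:], self._name))
        self._offset = end

    def close(self):
        # "close file": commit the transaction.
        self._conn.execute("COMMIT")
```

Reading the whole BLOB on every write keeps the sketch short; a real implementation would more likely update the BLOB in place.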

In some operating systems, OS locks may persist beyond 
the closing of a file. To emulate this feature, the lock file 
command is translated to a request for a session lock. 
Consequently, when the "commit transaction" is performed 
in response to the close file command, the lock on the row 
associated with the file is not automatically released. The 
lock thus established is released either explicitly in response 
to an unlock file command, or automatically in response to 
the termination of the database session through which the 
lock was acquired. 

In-Progress I/O Operations

When a file is created, the directory in which the file is 
created is updated to indicate the presence of the file. In 
some OS file systems, the modification to a directory to 
show a new file is committed before the new file is entirely 
generated. Some applications designed for those OS file 
systems take advantage of that feature. For example, an 
application may open a new file with a first file handle, and 
proceed to write data into the file. While the data is being 
written, the same application may open the file with a second 
file handle. 

Emulating this feature within the database involves special
issues because, in general, until a database transaction
commits, another transaction is not able to see the changes
made by the transaction. For example, assume that a first
database transaction is initiated in response to the first
"open" command. The first transaction updates a directory
table to indicate that the file exists in a particular directory,
and then updates a files table to insert a row that contains the
file. If a second database transaction is initiated in response
to a second open command, issued by the same application,
the second database transaction will see neither the
change to the directory table nor the new row in the files
table until the first transaction commits.

According to one embodiment of the invention, the ability
to see the directory entry of a file whose creation is in
progress is emulated in a database system by causing the
update to the directory table to be performed in a transaction
separate from the transaction used to insert the row for the
file in the files table. Thus, in response to the first open
command, translation engine 308 issues database commands
to (1) start a first transaction, (2) change the directory table
to indicate the existence of the new file, (3) commit the first
transaction, (4) start a second transaction, (5) insert a row for
the file into the files table, and (6) commit the second
transaction. By committing the change to the directory table
separately from the change to the files table, a third
transaction, initiated in response to a second open command,
may see the entry in the directory table while the insertion
into the files table is still in progress. If the second
transaction fails, then the directory will be left with an entry for
a file with no content.
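
The six-step sequence above can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: SQLite stands in for database server 204, the `directory` and `files` tables and the function name `create_file_visible_early` are hypothetical, and the second transaction is deliberately left open to model a write that is still in progress.

```python
import sqlite3


def create_file_visible_early(db_path, directory, name, body):
    """Commit the directory entry first, then start inserting the file row."""
    # isolation_level=None lets us issue BEGIN/COMMIT explicitly.
    writer = sqlite3.connect(db_path, isolation_level=None)

    # Steps (1)-(3): first transaction publishes the directory entry.
    writer.execute("BEGIN")
    writer.execute(
        "INSERT INTO directory (dir, name) VALUES (?, ?)", (directory, name))
    writer.execute("COMMIT")

    # Steps (4)-(5): second transaction inserts the file row.
    writer.execute("BEGIN")
    writer.execute(
        "INSERT INTO files (name, body) VALUES (?, ?)", (name, body))

    # Step (6) is left to the caller: execute "COMMIT" on success,
    # or "ROLLBACK" on failure (leaving a directory entry with no content).
    return writer
```

While the returned connection still holds its open transaction, a second connection to the same database file can see the new row in `directory` but not yet the row in `files`, which is the behavior the text describes for a second open command.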

The Translation Engine 

According to one embodiment of the invention, translation
engine 308 is designed in two layers. Those layers are
illustrated in FIG. 4. Referring to FIG. 4, translation engine
308 includes a protocol server layer, and a DB file server 408
layer. DB file server 408 allows applications to access data
stored in the database managed by database server 204
through an alternative API, referred to herein as the DB file
API. The DB file API combines aspects of both an OS file
API and the database API. Specifically, the DB file API
supports file operations similar to those supported by
conventional OS file APIs.

However, unlike OS file APIs, the DB file API incorporates
the database API concept of transactions. That is, the
DB file API allows applications to specify that a set of file
operations are to be performed as an atomic unit. The
benefits of having a transacted file system are described in
greater detail hereafter.

DB File Server 

The DB file server 408 is responsible for translating DB
file API commands to database commands. The DB file API
commands received by DB file server 408 may come from
the protocol server layer of translation engine 308, or
directly from applications (e.g. application 410) specifically
designed to perform file operations by issuing calls through
the DB file API.

According to one embodiment, DB file server 408 is
object oriented. Thus, the routines supplied by DB file server
408 are invoked by instantiating an object and calling
methods associated with the object. In one implementation,
the DB file server 408 defines a "transaction" object class
that includes the following methods: insert, save, update,
delete, commit and roll-back. The DB file API provides an
interface that allows external entities to instantiate and use
the transaction object class.

Specifically, when an external entity (e.g. application 410
or a protocol server) makes a call to DB file server 408 to
instantiate a transaction object, DB file server 408 sends a
database command to database server 204 to begin a new
transaction. The external entity then invokes the methods of
the transaction object. The invocation of a method results in
a call to DB file server 408. DB file server 408 responds to
the call by issuing corresponding database commands to
database server 204. All database operations that are
performed in response to the invocation of methods of a given
transaction object are performed as part of the database
transaction associated with the given transaction object.

Significantly, the methods invoked on a single transaction
object may involve multiple file operations. For example,
application 410 may interact with DB file server 408 as
follows: Application 410 instantiates a transaction object
TXO1 by making a call through the DB file API. In
response, DB file server 408 issues a database command to
start a transaction TX1 within database server 204.
Application 410 invokes the update method of TXO1 to update a
file F1 stored in the database managed by database server
204. In response, DB file server 408 issues a database
command to database server 204 to cause the requested
update to be performed as part of transaction TX1.
Application 410 invokes the update method of TXO1 to update a
second file F2 stored in the database managed by database
server 204. In response, DB file server 408 issues a database
command to database server 204 to cause the requested
update to be performed as part of transaction TX1.
Application 410 then invokes the commit method of TXO1. In
response, DB file server 408 issues a database command to
database server 204 to cause TX1 to be committed. If the
update to file F2 had failed, then the roll-back method of
TXO1 is invoked and all changes made by TX1, including
the update to file F1, are rolled back.
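
The TXO1/TX1 interaction above can be sketched as follows. The class and function names are hypothetical, SQLite stands in for database server 204, a `files` table with `name`, `body` and `size` columns is assumed, and the connection is expected to be opened with `isolation_level=None`. Only the update, commit and roll-back methods are shown.

```python
import sqlite3


class Transaction:
    """Hypothetical stand-in for the DB file API's transaction object."""

    def __init__(self, conn):
        self._conn = conn
        # Instantiating the object begins a new database transaction.
        self._conn.execute("BEGIN")

    def update(self, name, body):
        # Each method call becomes a database command in that transaction.
        cur = self._conn.execute(
            "UPDATE files SET body = ?, size = ? WHERE name = ?",
            (body, len(body), name))
        if cur.rowcount != 1:
            raise LookupError(f"no such file: {name}")

    def commit(self):
        self._conn.execute("COMMIT")

    def rollback(self):
        self._conn.execute("ROLLBACK")


def update_both(conn, first, second):
    """Update two files atomically: both changes apply, or neither does."""
    txo = Transaction(conn)
    try:
        txo.update(*first)
        txo.update(*second)
    except Exception:
        txo.rollback()  # undoes the first update as well
        raise
    txo.commit()
```

If the second update fails, the roll-back also discards the first, so file F1 keeps its previous contents.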

While techniques have been described herein with refer- 
ence to a DB file server that uses transaction objects, other 
implementations are possible. For example, within the DB 
file server, objects may be used to represent files rather than 
transactions. In such an implementation, file operations may 
be performed by invoking the methods of the file objects, 
and passing thereto data that identifies the transaction in 
which the operations are to be executed. Thus, the present 
invention is not limited to a DB file server that implements 
any particular set of object classes. 

For the purpose of explanation, the embodiment illustrated
in FIG. 4 shows DB file server 408 as a process
executing outside database server 204 that communicates
with database server 204 through the database API.
However, according to an alternative embodiment, the
functionality of DB file server 408 is built into database server
204. By building DB file server 408 into database server
204, the amount of inter-process communication generated
during the use of the DB file system is reduced. The database
server produced by incorporating DB file server 408 into
database server 204 would therefore provide two alternative
APIs for accessing data managed by the database server 204:
the DB file API and the database API (SQL).

Protocol Servers 

The protocol server layer of translation engine 308 is 
responsible for translating between specific protocols and 
DB file API commands. For example, protocol server 406a 
translates I/O commands received from operating system 
304a to DB file API commands that it sends to DB file server 
408. Protocol server 406a also translates DB file API com- 
mands received from DB file server 408 to I/O commands 
that it sends to operating system 304a. 

In practice, there is not a one-to-one correspondence 
between protocols and operating systems. Rather, many 
operating systems support more than one protocol, and 
many protocols are supported by more than one operating
system. For example, a single operating system may provide 
native support for one or more of network file protocols 
(SMB, FTP, NFS), e-mail protocols (SMTP, IMAP4), and 
web protocols (HTTP). Further, there is often an overlap 
between the sets of protocols that different operating sys- 
tems support. However, for the purpose of illustration, a 
simplified environment is shown in which operating system
304a supports one protocol, and operating system 304b
supports a different protocol.
The I/O API

As mentioned above, protocol servers are used to translate
I/O commands to DB file commands. The interface between
the protocol servers and the OS file systems with which they
communicate is generically labeled I/O API. However, the
specific I/O API provided by a protocol server depends on
both (1) the entity with which the protocol server
communicates, and (2) how the protocol server is to appear
to that entity. For example, operating system 304a may be
Microsoft Windows NT, and protocol server 406a may be
designed to appear as a device driver to Microsoft Windows
NT. Under those conditions, the I/O API presented by
protocol server 406a to operating system 304a would be a
type of device interface understood by Windows NT.
Windows NT would communicate with protocol server 406a as
it would with any storage device. The fact that files stored to and
retrieved from protocol server 406a are actually stored to
and retrieved from a database maintained by database server
204 is completely transparent to Windows NT.

While some protocol servers used by translation engine
308 may present device driver interfaces to their respective
operating systems, other protocol servers may appear as
other types of entities. For example, operating system 304a
may be the Microsoft Windows NT operating system and
protocol server 406a presents itself as a device driver, while
operating system 304b is the Microsoft Windows 95
operating system and protocol server 406b presents itself as a
Server Message Block (SMB) server. In the latter case,
protocol server 406b would typically be executing on a
different machine than the operating system 304b, and the
communication between the operating system 304b and
protocol server 406b would occur over a network
connection.

In the examples given above, the sources of the I/O
commands handled by the protocol servers are OS file
systems. However, translation engine 308 is not limited to
use with OS file system commands. Rather, a protocol server
may be provided to translate between the DB file commands
and any type of I/O protocol. Beyond the I/O protocols used
by OS file systems, other protocols for which protocol
servers may be provided include, for example, the File
Transfer Protocol (FTP) and the protocols used by electronic
mail systems (POP3 or IMAP4).

Just as the interface provided by the protocol servers that
work with OS file systems is dictated by the specific OS, the
interface provided by the protocol servers that work with
non-OS file systems will vary based on the entities that will
be issuing the I/O commands. For example, a protocol server
configured to receive I/O commands according to the FTP
protocol would provide the API of an FTP server. Similarly,
protocol servers configured to receive I/O commands
according to the HTTP protocol, the POP3 protocol, and the
IMAP4 protocol, would respectively provide the APIs of an
HTTP server, a POP3 server, and an IMAP4 server.
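
As a rough illustration of a protocol server that presents an FTP-like face to its clients while delegating storage to the DB file server, consider the sketch below. Everything in it is an assumption made for illustration: `InMemoryDbFileServer` is a dictionary-backed stand-in for DB file server 408, `FtpStyleProtocolServer` handles only three FTP verbs (STOR, RETR, DELE), and no network handling is shown.

```python
class InMemoryDbFileServer:
    """Hypothetical stand-in for DB file server 408; keeps files in a dict."""

    def __init__(self):
        self.files = {}

    def insert(self, path, body):
        self.files[path] = body

    def read(self, path):
        return self.files[path]

    def delete(self, path):
        del self.files[path]


class FtpStyleProtocolServer:
    """Translates FTP-style verbs into calls on the DB file server."""

    def __init__(self, db_file_server):
        self._db = db_file_server

    def handle(self, verb, path, payload=b""):
        # Returns (reply line, data) in the protocol's own vocabulary,
        # so the client never sees that a database sits underneath.
        try:
            if verb == "STOR":
                self._db.insert(path, payload)
                return "226 Transfer complete", b""
            if verb == "RETR":
                return "226 Transfer complete", self._db.read(path)
            if verb == "DELE":
                self._db.delete(path)
                return "250 File deleted", b""
        except KeyError:
            return "550 File unavailable", b""
        return "502 Command not implemented", b""
```

A protocol server for HTTP, POP3 or IMAP4 would follow the same shape, with a different verb set and different reply conventions.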

Similar to OS file systems, each non-OS file protocol
expects certain attributes to be maintained for its files. For
example, while most OS file systems store data to indicate
the last modified date of a file, electronic mail systems store
data for each e-mail message to indicate whether the e-mail
message has been read. The protocol server for each specific
protocol implements the logic required to ensure that the
semantics of its protocol are emulated in the database file
system.

Transacted File System

Within database systems, operations are generally
performed as part of a transaction. The database system
performs all of the operations that are part of a transaction as a
single atomic operation. That is, either all of the operations
are completed successfully, or none of the operations are
performed. During the execution of a transaction, if an
operation cannot be performed, all of the previously
executed operations of that transaction are undone or "rolled
back".

In contrast to database systems, OS file systems are not 
transaction based. Thus, if a large file operation fails, the 
portion of the operation that was performed prior to the 
failure remains. The failure to undo incomplete file opera- 
tions can lead to corrupt directory structures and files. 

According to one aspect of the invention, a transacted file
system is provided. As mentioned above, translation engine
308 converts I/O commands to database statements that are
sent to database server 204. The series of statements sent by
translation engine 308 to execute a specified I/O operation is
preceded by a begin transaction statement, and ended with
a close transaction statement. Consequently, if any failure
occurs during the execution of those statements by database
server 204, then all of the changes made as part of that
transaction by database server 204 up to the point of the
failure will be rolled back.

The events that cause the failure of a transaction may vary
based on the system from which the I/O commands origi- 
nate. For example, an OS file system may support the 
concept of signatures, where a digital "signature" identify- 
ing the source of a file is appended to the file. A transaction 
that is initiated to store a signed file may fail, for example, 
if the signature of the file being stored is not the expected 
signature. 

On-The-Fly Intelligent File Conversion 

According to one aspect of the invention, files are pro- 
cessed prior to insertion into a relational database, and 
processed again as they are retrieved from the relational 
database. FIG. 9 is a block diagram that illustrates the 
functional components of DB file server 408 that are used to
perform the inbound and outbound file processing. 

Referring to FIG. 9, translation engine 308 includes a 
rendering unit 904 and a parsing unit 902. In general, parsing 
unit 902 is responsible for performing the inbound process- 
ing of files, and rendering unit 904 is responsible for 
performing the outbound processing of files. Each of these
functional units shall now be described in greater detail. 

Inbound File Processing 

Inbound files are passed to DB file server 408 through the 
DB file API. Upon receiving an inbound file, parsing unit 
902 identifies the file type of the file, and then parses the file 
based on its file type. During the parsing process, parsing 
unit 902 extracts structured information from the file being 
parsed. The structured information may include, for 
example, information about the file being parsed, or data that 
represents logically distinct components or fields of the file. 
This structured information is stored in the database along 
with the file from which the structured information was 
generated. Queries may then be issued to the database server 
to select and retrieve files based on whether the structured 
information thus extracted satisfies particular search criteria. 

The specific techniques used by parsing unit 902 to parse
a document, and the structured data generated thereby, will
vary based on the type of document that is passed to the
parsing unit 902. Thus, prior to performing any parsing
operations, parsing unit 902 identifies the file type of the
document. Various factors may be taken into account to
determine the file type of a file. For example, in DOS or
Windows operating systems, the file type of a file is
frequently indicated by an extension in the filename of the file.
Thus, if the filename ends in ".txt", then parsing unit 902
classifies the file as a text file, and applies the
text-file-specific parsing techniques to the file. Similarly, if the
filename ends in ".doc", then parsing unit 902 classifies the
file as a Microsoft Word document and applies
Microsoft-Word-specific parsing techniques to the file. In contrast, the
Macintosh Operating System stores file type information for
a file as an attribute maintained separate from the file.

Other factors that may be considered by parsing unit 902
to determine the file type of a file include, for example, the
directory in which the file is located. Thus, parsing unit 902
may be configured to classify and parse all files that are
stored in the \WordPerfect\documents directory as
WordPerfect documents, regardless of the filenames of those files.
Alternatively, both the file type of an inbound file and the
file type required by a requesting entity may be specified by
or inferred through information provided to DB file server
408. For example, when a web browser sends a message, the
message typically includes information about the browser
(e.g. the browser type, version, etc.). When a web browser
requests a file through an HTTP protocol server, this
information is passed to DB file server 408. Based on this
information, rendering unit 904 may look up information
about the capabilities of the browser and infer from those
capabilities the best file type to deliver to the browser.
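
By way of illustration only, the following Python sketch shows one
way the file type factors described above (an explicitly supplied
type, the containing directory, and the filename extension) could
be combined. The lookup tables, the type labels, and the function
name classify_file are assumptions made for this example and are
not part of the described system.

```python
import ntpath  # Windows-style path handling, available on any platform

# Hypothetical lookup tables; the entries are illustrative only.
EXTENSION_TYPES = {".txt": "text", ".doc": "msword"}
DIRECTORY_TYPES = {"\\WordPerfect\\documents": "wordperfect"}


def classify_file(path, declared_type=None):
    """Return a file type label for a Windows-style path.

    Precedence (an assumption of this sketch): a type supplied by the
    caller, then the containing directory, then the extension.
    """
    if declared_type is not None:
        return declared_type
    directory, filename = ntpath.split(path)
    if directory in DIRECTORY_TYPES:
        return DIRECTORY_TYPES[directory]
    _, extension = ntpath.splitext(filename)
    return EXTENSION_TYPES.get(extension.lower(), "unknown")
```

In this sketch a file stored in \WordPerfect\documents is classified
as a WordPerfect document even when its filename ends in ".txt",
mirroring the directory-based rule described above.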

As mentioned above, the specific parsing techniques used
by parsing unit 902, and the type of structured data thus
generated, will vary based on the type of file that is being
parsed. For example, the structured data generated by parsing
unit 902 may include embedded metadata, derived
metadata, and system metadata. Embedded metadata is
information embedded within the file itself. Derived
metadata is information that is not contained within the file, but
which can be derived by analyzing the file. System metadata
is data about the file provided by the system from which the
file originates.

For example, assume that application 410 passes a
Microsoft Word document to parsing unit 902. Parsing unit
902 parses the document to extract information about the file
that is embedded within the file. The information embedded
in a Microsoft Word document, for example, may include
data that indicates the author of the document, a category to
which the document has been assigned, and comments about
the document.

In addition to locating and extracting embedded
information about the Word document, parsing unit 902 may also derive
information about the document. For example, parsing unit 902
may scan the Word document to determine how many pages,
paragraphs and words are contained in the document.

Finally, the system in which the document originated may
supply to parsing unit 902 data that indicates the size,
creation date, last modification date, and file type of the
document.
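
The three kinds of metadata can be pictured with a small Python
sketch. It assumes a plain-text document, treats blank lines as
paragraph separators, and takes the embedded and system metadata as
already-extracted dictionaries; the function names and dictionary
keys are hypothetical.

```python
def derive_metadata(text):
    """Derived metadata: values computed by analyzing the content."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    return {"paragraphs": len(paragraphs), "words": len(text.split())}


def build_metadata(text, embedded, system):
    """Group the three kinds of metadata under separate keys."""
    return {
        "embedded": dict(embedded),   # e.g. author, category, comments
        "derived": derive_metadata(text),
        "system": dict(system),       # e.g. size, creation date
    }
```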

The more structured the file type of a document, the easier
it is to extract specific items of structured data from the
document. For example, an HTML document typically has
delimiters or "tags" that specify the beginning and end of
specific fields (title, heading1, heading2, etc.). These
delimiters may be used by parsing unit 902 to parse the HTML
document, thus producing an item of metadata for some or
all of the delimited fields. Similarly, XML files are highly
structured, and the XML parser could extract a separate item
of metadata for some or all of the fields contained in the
XML document.
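
As a minimal sketch of tag-driven extraction, the following Python
example uses the standard library's html.parser module to collect
the text of a few delimited fields. The choice of fields (title, h1,
h2) and the names FieldExtractor and extract_fields are assumptions
made for illustration.

```python
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text found inside a fixed set of delimited fields."""

    FIELDS = {"title", "h1", "h2"}  # illustrative choice of fields

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag in self.FIELDS:
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        # Only text that appears inside a tracked field is recorded.
        if self._current and data.strip():
            self.metadata.setdefault(self._current, []).append(data.strip())


def extract_fields(html_text):
    parser = FieldExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.metadata
```

Each key of the returned dictionary corresponds to one item of
metadata that could be stored in its own column.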

Once the parsing unit 902 has generated structured data
for a file, DB file server 408 issues database commands to
database server 204 to cause the file to be inserted into a row
of a files table (e.g. files table 710). According to one
embodiment, the database commands thus issued store the
file as a BLOB in one column of the row, and store the
various items of structured data generated for the file in other
columns of the same row.

Alternatively, some or all of the structured data items for
a file may be stored outside the files table. Under such
circumstances, the rows that store structured data associated
with a file would typically contain data that identifies the
file. For example, assume that a Word document is stored in
row R20 of the files table, and that the system metadata (e.g.
creation date, modification date, etc.) for that Word
document is stored in row R34 of a system attributes table. Under
these circumstances, both R20 of the files table and R34 of
the system attributes table would typically contain a FileID
column that stores a unique identifier for the Word
document. Queries can then retrieve both the file and the system
metadata about the file by issuing a join statement that joins
rows in the files table to rows in the system attributes table
based on the FileID values. A technique for storing file
attributes in tables associated with file "classes" is described
in greater detail hereafter.
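
The storage layout described above can be sketched with Python's
built-in sqlite3 module standing in for the database server. The
table names, column names other than FileID, and all data values
below are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    FileID INTEGER PRIMARY KEY,
    Name   TEXT,
    Body   BLOB,   -- the file itself
    Author TEXT    -- one item of structured data kept in the same row
);
CREATE TABLE system_attributes (
    FileID           INTEGER,  -- identifies the file the row describes
    CreationDate     TEXT,
    ModificationDate TEXT
);
""")
conn.execute("INSERT INTO files VALUES (?, ?, ?, ?)",
             (20, "report.doc", b"raw file bytes", "A. Writer"))
conn.execute("INSERT INTO system_attributes VALUES (?, ?, ?)",
             (20, "2000-01-01", "2000-02-01"))

# Retrieve the file and its system metadata with a join on FileID.
row = conn.execute("""
    SELECT f.Name, f.Body, s.CreationDate, s.ModificationDate
    FROM files f JOIN system_attributes s ON s.FileID = f.FileID
    WHERE f.FileID = ?
""", (20,)).fetchone()
```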

Outbound File Processing 

Outbound files are constructed by rendering unit 904
based on information retrieved in response to database
commands sent to database server 204. Once constructed, an
outbound file is delivered, through the DB file API, to the
entity that requested it.

Significantly, the file type of the outbound file produced
by rendering unit 904 (the target file type) is not necessarily
the same file type as the file that produced the data that is
used to construct the outbound file (the source file type). For
example, rendering unit 904 may construct a text file based
on data that was originally stored within the database as a
Word file.

Further, the entity requesting an outbound file may be on
an entirely different platform, and using an entirely different
protocol, than the entity that produced the file from which
the outbound file is constructed. For example, assume that
protocol server 406b implements an IMAP4 server interface,
and that protocol server 406a implements an HTTP server
interface. Under these conditions, an e-mail document that
originates from an e-mail application may be stored into the
database through protocol server 406b, and retrieved from
the database by a Web browser through protocol server
406a. In this scenario, parsing unit 902 would invoke the
parsing techniques associated with the e-mail file type (e.g.
RFC822), and rendering unit 904 would invoke the rendering
routines that construct an HTML document from the e-mail
data retrieved from the database.

Parser and Renderer Registration

As mentioned above, the parsing techniques applied to a
file are dictated by the type of the file. Similarly, the
rendering techniques applied to a file are dictated by both the
source type of the file and the target type of the file. The
number of file types that exist across all computer platforms
is enormous. Thus, it is not practical to build a parsing unit
902 that handles all known file types, nor a rendering unit
904 that handles all possible file-type to file-type
conversions.

According to one embodiment of the invention, the
problem caused by the proliferation of file types is addressed by
allowing type-specific parsing modules to be registered with
parsing unit 902, and type-specific rendering modules to be
registered with rendering unit 904. A type-specific parsing
module is a module that implements the parsing techniques 
for a specific file type. For example, Word documents may 
be parsed using a Word Document parsing module, while 
POP3 e-mail documents are parsed using a POP3 e-mail 
parsing module. 

Similar to type-specific parsing modules, type-specific 
rendering modules are modules that implement the tech- 
niques for converting data associated with one or more 
source file types into one or more target file types. For 
example, a type-specific rendering module may be provided 
for converting Word documents into text documents. 

In some cases, conversion may be required even when the 
source and target file types are the same. For example, when 
parsed and inserted into the database, the contents of an
XML document may not be maintained in a single BLOB, 
but spread over numerous columns of numerous tables. In 
that case, XML is the source file type of that data, even 
though that data is no longer stored as an XML file. A 
type-specific rendering module may be provided to construct 
an XML document from that data. 

When an inbound file is received by parsing unit 902, 
parsing unit 902 determines the file type of the file and 
determines whether a type-specific parsing module has been 
registered for that file type. If a type-specific parsing module 
has been registered for that file type, then parsing unit 902 
calls the parsing routines provided by that type-specific 
parsing module. Those parsing routines parse the inbound 
file to generate metadata, which metadata is then stored into 
the database along with the file. If a type-specific parsing 
module has not been registered for the file type, then parsing 
unit 902 may raise an error or, alternatively, apply a generic 
parsing technique to the file. Because the generic parsing 
technique would not have any knowledge about the content 
of the file, the generic parsing technique would be limited 
with respect to the useful metadata it could generate for the 
file. 
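
The inbound dispatch just described can be sketched as a small
registry in Python. The registry structure, the function names, and
the strict flag that selects between raising an error and falling
back to a generic parser are assumptions of this example.

```python
PARSERS = {}  # file type -> type-specific parsing routine


def register_parser(file_type, parse_fn):
    PARSERS[file_type] = parse_fn


def generic_parse(data):
    # Knows nothing about the content, so it can report very little.
    return {"size_bytes": len(data)}


def parse_inbound(file_type, data, strict=False):
    """Use the registered parser, else raise or fall back to generic."""
    parse_fn = PARSERS.get(file_type)
    if parse_fn is None:
        if strict:
            raise LookupError(f"no parser registered for {file_type!r}")
        parse_fn = generic_parse
    return parse_fn(data)


def parse_text(data):
    text = data.decode("utf-8")
    return {"size_bytes": len(data), "words": len(text.split())}


register_parser("text", parse_text)
```

A file of an unregistered type yields only the limited metadata the
generic routine can compute, which reflects the limitation noted
above.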

When a file request is received by rendering unit 904, 
rendering unit 904 issues database commands to retrieve the 
data associated with the file. That data includes metadata 
that indicates the source file type of the file. Rendering unit 
904 then determines whether a type-specific rendering module
has been registered for that source file type. If a
type-specific rendering module has been registered for that
source file type, then rendering unit 904 invokes the
rendering routines provided by that type-specific rendering
module to construct a file, and provides the file thus
constructed to the entity requesting the file.

Various factors may be used to determine which target file
type should be selected by a type-specific rendering module. In
some cases, the entity requesting the file may explicitly
indicate the type of file it requires. For example, a text editor
may only be able to handle text files. The text editor may
request a file whose source file type is a Word Document. In
response to the request, a Word-specific rendering module
may be invoked which, based on the required target file type,
converts the Word document to a text file. The text file is
then delivered to the text editor.

In other cases, the entity requesting the file may support 
numerous file types. According to one embodiment, the 
type-specific rendering module incorporates logic that (1) 
identifies a set of file types that are supported by both the 
requesting entity and the type-specific rendering module,
and (2) selects the best target file type in that set. The 
selection of the best target file type may take into account 
various factors, including the specific characteristics of the 
file in question. 

For example, assume that (1) DB file server 408 receives
a request for a file, (2) the source file type for the file
indicates that the file is a "BMP" image, (3) the request was
initiated by an entity that supports "GIF", "TIF" and "JPG"
images, and (4) the BMP source type-specific rendering module
supports target file types of "GIF", "JPG" and "PCX". Under
these conditions, the BMP source type-specific rendering
module determines that both "GIF" and "JPG" are possible
target file types. To select between the two possible target
file types, the BMP source type-specific rendering module
may take into account information about the file, including
its resolution and color depth. Based on this information, the
BMP source type-specific rendering module may determine
that JPG is the best target file type, and then proceed to
convert the BMP file into a JPG file. The resulting JPG file
is then delivered to the requesting entity.
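
The two-step selection in the BMP example can be sketched in Python
as follows. The intersection step follows the description above; the
selection policy (prefer JPG when the color depth exceeds the 8 bits
that GIF's 256-color palette can represent) is a hypothetical rule
introduced only to make the example concrete.

```python
def common_types(requester_types, renderer_types):
    """Step (1): target types supported by both sides."""
    return [t for t in renderer_types if t in requester_types]


def choose_target_type(requester_types, renderer_types, color_depth_bits):
    """Step (2): pick the best candidate using file characteristics."""
    candidates = common_types(requester_types, renderer_types)
    if not candidates:
        return None
    # Hypothetical policy: GIF holds at most 256 colors (8 bits).
    preferred = "JPG" if color_depth_bits > 8 else "GIF"
    return preferred if preferred in candidates else candidates[0]
```

With the values from the example above, the candidates are GIF and
JPG, and a 24-bit source image leads this sketch to choose JPG.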

According to one embodiment, type-specific parsing and 
rendering modules are registered by storing information in a 
database table that indicates the capabilities of the module. 
For example, the entry for a type-specific rendering module
may indicate that it should be used when the source file type 
is XML and the requesting entity is a Windows-based Web 
Browser. The entry for a type-specific parsing module may 
indicate that it should be used when the source file type is a 
.GIF image. 

When the DB file server 408 receives a file-related
command through the DB file API, the DB file server 408
determines the file type at issue, and the identity of the entity
that issued the command. DB file server 408 then issues
database commands to database server 204 which cause
database server 204 to scan the table of registered modules
to select the appropriate module to use under the current
circumstances. In the case of an inbound file, the appropriate
parsing module is invoked to parse the file before it is
inserted into the database. In the case of an outbound file, the
appropriate rendering module is invoked to construct the
outbound file from data retrieved from the database.

According to an embodiment of the invention, the DB file
system allows file classes to be defined using object oriented
techniques, where each file type belongs to a file class, and
file classes can inherit attributes from other file classes. In
such a system, the file class of a file may be a factor used in
determining the appropriate parser and renderer for the file.
The use of file classes shall be described in greater detail
hereafter.
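
Registration through a database table, and the lookup performed when
a file-related command arrives, can be sketched with Python's sqlite3
module. The table layout, the module names, and the convention that a
NULL requester means "any requesting entity" are assumptions of this
example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE registered_modules (
        kind        TEXT,  -- 'parser' or 'renderer'
        source_type TEXT,
        requester   TEXT,  -- NULL means any requesting entity
        module_name TEXT
    )
""")
conn.executemany(
    "INSERT INTO registered_modules VALUES (?, ?, ?, ?)",
    [
        ("renderer", "XML", "windows-web-browser", "xml_to_html_renderer"),
        ("parser", "GIF", None, "gif_parser"),
    ],
)


def select_module(kind, source_type, requester=None):
    """Scan the registration table for a module matching the request."""
    row = conn.execute(
        """SELECT module_name FROM registered_modules
           WHERE kind = ? AND source_type = ?
             AND (requester IS NULL OR requester = ?)""",
        (kind, source_type, requester),
    ).fetchone()
    return row[0] if row else None
```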

Stored Query Directories 

As explained above, a hierarchical directory structure
may be implemented in a database system using a files table
710, where each row corresponds to a file. A hierarchical
index 510 may be employed to efficiently locate the row
associated with a specified file based on the pathname of the
file.

In the embodiment illustrated in FIGS. 5 and 7, the child 
files of each directory are explicitly enumerated. In 
particular, the child files of each directory are enumerated in 
the Dir_entry_list of the index entry associated with the
directory. For example, index entry 512 corresponds to the 
Windows directory 614, and the Dir_entry_list of index 
entry 512 explicitly enumerates "Word" and "Access" as the 
child files of Windows directory 614. 
According to one aspect of the invention, a file system is
provided in which the child files of some or all directories
are not explicitly enumerated, but instead are dynamically
determined based on the search results of stored queries.
Such directories are referred to herein as stored query
directories.

For example, assume that a file system user desires to
group all files with the extension .doc into a single directory.
With conventional file systems, the user would create a
directory, search for all files with the extension .doc, and
then either move the files found by the search into the newly
created directory, or create hard links between the newly
created directory and the files found by the search.
Unfortunately, the contents of the newly created directory
accurately reflect the state of the system only at the time the
search was performed. Files would remain in the directory
if renamed to something that did not have the .doc extension.
In addition, files with the .doc extension that are created in
other directories after the new directory is established would
not be included in the new directory.

Rather than statically define the membership of the new
directory, the membership of the directory may be defined
by a stored query. A stored query that selects the files that
have the extension .doc may appear as follows:

Q1:

    SELECT * from files_table
    where
    files_table.Extension = "doc"

Referring to FIG. 7, when executed against table 710, the
query Q1 selects rows R4 and R12, which are the rows for
the two documents entitled "Example.doc".

According to one embodiment of the invention, a
mechanism is provided to link queries, such as query Q1, to
directory entries in the hierarchical index 510. During the
traversal of the hierarchical index 510, when a directory
entry that contains such a link is encountered, the query
identified by the link is executed. Each file selected by the
query is treated as a child of the directory associated with the
directory entry, just as if the file had been an explicit entry
in the database table that stores directory entries.
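
A minimal Python sketch of this behavior follows, with sqlite3
standing in for the database. The index entries are modeled as plain
dictionaries, the link is modeled as an identifier into a
stored_queries table, and the FileID values are arbitrary; all of
these are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files_table "
             "(FileID INTEGER PRIMARY KEY, Name TEXT, Extension TEXT)")
conn.execute("CREATE TABLE stored_queries "
             "(QueryID INTEGER PRIMARY KEY, QueryText TEXT)")
conn.executemany(
    "INSERT INTO files_table VALUES (?, ?, ?)",
    [(1, "Example.doc", "doc"), (2, "Example.doc", "doc"),
     (3, "notes.txt", "txt")],
)
conn.execute(
    "INSERT INTO stored_queries VALUES (?, ?)",
    (1, "SELECT FileID, Name FROM files_table "
        "WHERE Extension = 'doc' ORDER BY FileID"),
)

# Hypothetical index entries: one conventional, one stored query directory.
word_entry = {"dir_entry_list": [(1, "Example.doc")], "query_pointer": None}
documents_entry = {"dir_entry_list": [], "query_pointer": 1}


def list_children(index_entry):
    """Return (FileID, Name) pairs for the children of a directory."""
    pointer = index_entry["query_pointer"]
    if pointer is None:
        return list(index_entry["dir_entry_list"])
    (query_text,) = conn.execute(
        "SELECT QueryText FROM stored_queries WHERE QueryID = ?", (pointer,)
    ).fetchone()
    return conn.execute(query_text).fetchall()
```

Because the query runs at listing time, a .doc file inserted into
files_table later appears in the next listing of the stored query
directory without any change to the index entry.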

For example, assume that a user desires to create a
directory "Documents" that is a child of Word 616, and
desires the Documents directory to contain all files that have
the extension .doc. According to one embodiment of the
invention, the user designs a query that specifies the
selection criteria for the files that are to belong to the directory.
In the present example, the user may generate query Q1. The
query is then stored into the database system.

Similar to other types of directories, a row for the
Documents directory is added to the files table 710, and an index
entry for the Documents directory is added to the hierarchical
index 510. In addition, the Dir_entry_list of the index entry
for the Word directory is updated to indicate that the new
Documents directory is a child of the Word directory. Rather
than explicitly list children in a Dir_entry_list, the new
directory entry for the Documents directory contains a link to
the stored query.

FIGS. 10 and 11 respectively show the state of
hierarchical index 510 and files table 710 after the appropriate entries
have been created for the Documents directory. Referring to
FIG. 10, an index entry 1004 has been created for the
Documents directory. Because the children of the
Documents directory are determined dynamically based on the
result set of a stored query, the Dir_entry_list field of the
index entry 1004 is null. Instead of a static enumeration of
child files, the index entry 1004 includes a link to the stored
query 1002 that is to be executed to determine the child files
of the Documents directory.

In addition to the creation of index entry 1004 for the
Documents directory, the existing index entry 514 for the
Word directory is updated to indicate that Documents is a
child of the Word directory. Specifically, a Dir_entry_list
array entry is added to index entry 514 that identifies the
name "Documents", the RowID of the index entry for the
Documents directory (i.e. Y7), and the FileID of the
Documents directory (i.e. X13).

In the illustrated embodiment, two columns have been
added to the hierarchical index 510. Specifically, a Stored
Query Directory (SQD) column contains a flag to indicate
whether the directory entry is for a stored query directory. In
the directory entries for stored query directories, a Query
Pointer (QP) column stores a link to the stored queries
associated with the directories. In directory entries for
directories that are not stored query directories, the QP
column is null.

The nature of the link may vary from implementation to
implementation. For example, according to one
implementation, the link may be a pointer to the storage
location at which the stored query is stored. According to
another implementation, the link may simply be a unique
stored query identifier that may be used to look up the stored
query in a stored query table. The present invention is not
limited to any particular type of link.

FIG. 11 illustrates files table 710 as
updated to include a row (R13) for the Documents directory.
According to one embodiment, the same metadata that is
maintained for conventional directories is also maintained
for the Documents directory. For example, row R13 may
include a creation date, a last modification date, etc.

FIG. 12 is a block diagram of a file hierarchy. The
hierarchy shown in FIG. 12 is the same as that of FIG. 6,
with the addition of the Documents directory 1202. When
any application requests a display of the contents of the
Documents directory 1202, the database executes the query
associated with the Documents directory 1202. The query
selects the files that satisfy the query. The results of the
query are then presented to the application as the contents of
the Documents directory 1202. At the time illustrated in
FIG. 12, the file system only includes two files that satisfy
the query associated with the Documents directory 1202.
Those two files are both entitled Example.doc. Thus, the two
Example.doc files 618 and 622 are shown as children of the
Documents directory 1202.

In many OS file systems, the same directory cannot store
two different files with the same name. Thus, the existence
of two files entitled Example.doc within Documents
directory 1202 may violate the OS file system conventions.
Various techniques may be used to address this issue. For
example, the DB file system may append characters to each
filename to produce unique filenames. Thus, Example.doc
618 may be presented as Example.doc1, while Example.doc
622 is presented as Example.doc2. Rather than append
characters that convey no particular information, the
appended characters may be selected to convey meaning.
For example, the appended characters may indicate the path
to the directory in which the file is statically located. Thus,
Example.doc 618 may be presented as
Example.doc_Windows_Word, while Example.doc 622 is presented as
Example.doc_VMS_App4. Alternatively, stored query
directories may simply be allowed to violate the OS file
system conventions.
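
The path-based naming technique can be sketched in Python as follows.
The function name present_names, the representation of each child as
a (name, static path) pair, and the decision to leave non-colliding
names unchanged are assumptions of this example.

```python
from collections import Counter


def present_names(children):
    """children: list of (filename, static_directory_path) pairs.

    Filenames that collide are suffixed with the path of the directory
    in which the file is statically located; others are unchanged.
    """
    counts = Counter(name for name, _ in children)
    presented = []
    for name, path in children:
        if counts[name] > 1:
            suffix = "_".join(part for part in path.split("\\") if part)
            presented.append(f"{name}_{suffix}")
        else:
            presented.append(name)
    return presented
```

Applied to two files named Example.doc located in \Windows\Word and
\VMS\App4, the sketch yields Example.doc_Windows_Word and
Example.doc_VMS_App4.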

In the embodiment shown in FIG. 10, the child files of a
given directory are either all statically defined, or all defined
by a stored query. However, according to one embodiment
of the invention, a directory may have some statically
defined child files, and some child files that are defined by
a stored query. For example, rather than having a null
Dir_entry_list, index entry 1004 could have a
Dir_entry_list that statically specifies one or more child files. Thus,
when an application asks the database system to specify
the children of the Documents directory, the database server
would list the union of the statically defined child files and
the child files that satisfy the stored query 1002.
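
The union described above can be sketched in a few lines of Python.
The stored query is modeled as a callable that returns the currently
matching children; that modeling choice and the function name are
assumptions of the example.

```python
def list_children(static_children, run_stored_query=None):
    """Union of statically listed children and stored query results."""
    children = list(static_children)
    if run_stored_query is not None:
        for child in run_stored_query():
            if child not in children:  # a file may qualify both ways
                children.append(child)
    return children
```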

Significantly, the stored query that identifies the child files 
of a directory may select other directories as well as docu- 
ments. Some or all of those other directories may themselves 
be stored query directories. Under some circumstances, the 
stored query of a particular directory may even select the 
particular directory itself, causing the directory to be its own 
child. 

Because the child files of stored query directories are 
determined on-the-fly, a listing of the child files will always 
reflect the current state of the database. For example, assume 
that a "Documents" stored query directory is created, as 
described above. Every time a new file is created with the 
extension .doc, the file automatically becomes a child of the 
Documents directory. Similarly, if the extension of a file is 
changed from .doc to .txt, the file will automatically cease to 
qualify as a child of the Documents directory. 

According to one embodiment, the query associated with 
a stored query directory may select certain database records
to be the child files of the directory. For example, a directory 
entitled "Employees" may be linked to a stored query that 
selects all rows from an Employee table within the database. 
When an application requests the retrieval of one of the 
virtual employee files, a renderer uses the data from the
corresponding employee record to generate a file of the file 
type expected by the requesting application. 

Stored Query Documents 

Just as stored queries may be used to specify the child files 
of a directory, stored queries may also be used to specify the 
contents of a document. FIGS. 7 and 11
illustrate files table 710 with a Body column. For directories,
the Body column is null. For documents, the Body column
contains a BLOB that contains the document. For a file
whose contents are specified by a stored query, the Body
column may contain a link to the stored query. When an
application requests the retrieval of a stored query
document, the stored query that is linked to the row
associated with the stored query document is executed. The
content of the document is then constructed based on the
result set of the query. According to one embodiment, the
process of constructing the document from the query results
is performed by a renderer, as described above.

In addition to providing support for documents whose
contents are entirely dictated by the results of a stored query,
support may also be provided for documents in which some
portions are dictated by the results of a query, while other
portions are not. For example, the Body column of a row in
the document directory may contain a BLOB, while another
column contains a link to a stored query. When a request is
received for the file associated with that row, the query may
be executed, and the results of the query may be combined
with the BLOB during the rendering of the file.
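
The following Python sketch illustrates rendering a document whose
content comes partly from a stored BLOB and partly from a query. The
employees table, its contents, the plain-text output format, and the
function name render_document are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ada", "Dept1"), ("Bob", "Dept2")])


def render_document(body, query_text):
    """Combine a static body (bytes or None) with query results."""
    parts = []
    if body is not None:
        parts.append(body.decode("utf-8"))
    if query_text is not None:
        # Each result row becomes one line of the rendered document.
        for row in conn.execute(query_text):
            parts.append(", ".join(str(value) for value in row))
    return "\n".join(parts)
```

Passing None for the body models a document dictated entirely by the
stored query; passing None for the query models a conventional
document.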

Multiple-Level Stored Query Directories 

As mentioned above, a stored query may be used to 
dynamically select the child files of a directory. The child 
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files of a directory all belong to the same level in the file In yet another approach, files that are moved into a stored 
hierarchy (i.e. the level immediately below the directory query directory may be automatically modified so that they 
associated with the stored query). According to one satisfy the criteria of the stored query associated with the 
embodiment, the stored query associated with a directory directory. For example, assume that the stored query asso- 
may define multiple levels below the directory. Directories 5 ciated with a stored query directory selects all employees 
that are associated with queries that define multiple levels that are married. If a file that corresponds to an employee 
are referred to herein as multiple- level stored query direc- record is moved to that stored query directory the "married 
tor - es field of the employee record is updated to indicate that the 
, employee is married. 

For example, a ^multiple-level stored query directory may similarly, files that are moved out of a stored query 

be associated with a query that selects all employee records w dk&ciory may be automatically modified ^ that they cease 

in an employee table, and groups those employees records t0 satisfy me criteria of me stored query assoc i at ed with the 

by department and by region. Under these conditions, sepa- directory. For example, if a file in the "married employee" 

rate hierarchical levels may be established for each grouping stored query directory is moved out of the directory, then the 

key (department and region) and for the employee records. "married" field of the corresponding employee record is 

Specifically, the results of such a query may be presented as 15 up d a ted to indicate that the employee is not married, 

three different levels in the file hierarchy. The child files of When an attempt is made to move a file that does not 

the directory would be determined by the first grouping satisfy the criteria of a stored query into the corresponding 

criteria. In the present example, the first grouping criteria is stored query directory, another approach is to update the 

"department". Hence, the child files of the directory may be index entry for the stored query directory to statically 

the various department values: "Deptl", "Dept2" and 20 establish the file as a child of the stored query directory. 

"Dept3". These child files would themselves be presented as Under those circumstances, the stored query directory would 

directories. nave some child files that are child files because they satisfy 

The child files of the department directories would be < he st0K * ^ » nd otber c i» ld 61" that are child files 

determined by the second grouping criteria. In the present b °°™ e they haVe be6D m ° Ved t0 ** St ° red ^ 

example, the second grouping criteria is "region". Thus, 25 irecorv - 

each department directory would have a child file for each Programmatically Defined Files 

of the region values, such as "North", "South", "East", Stored query directories and stored query documents are 

"West". The region files would also be presented as direc- examples of programmatically defined files. Aprogrammati- 

tories. Finally, the child files of each region directory would cally defined file is an entity that is presented to the file 

be files that correspond to the particular department/region system as a file (e.g. a document or a directory), but whose 

combination associated with the region directory. For contents and/or child files are determined by executing code, 

example, the children of the \Deptl\East directory would be The code that is executed to determine the contents of the 

the employees that are in Department 1 in the East region. file may include a stored database query, as in the case of 

stored query files, and/or other code. According to one embodiment, the code associated with a programmatically defined file implements the following routines:

resolve_filename(filename): child_file_handle;

list_directory;

fetch;

put;

delete;

The resolve_filename routine returns a file handle of a file that has the name "filename" and is a child of the programmatically defined file. The list_directory routine returns a list of all child files of the programmatically defined file. The fetch routine retrieves the contents of the programmatically defined file. The put routine inserts data into the programmatically defined file. The delete routine deletes the programmatically defined file.

According to one embodiment, a "resolve_pathname(path): file_handle" routine is also provided. The resolve_pathname routine receives a path and iteratively calls the resolve_filename function for each filename in the path.

According to one embodiment, the DB file system provides an object class that implements the above-listed routines for conventional files (i.e. files that are not programmatically defined). For the purpose of explanation, that object class shall be referred to herein as the "directory class". To implement a programmatically defined file, a subclass of the directory class is established. The subclass inherits the routines of the directory class, but allows the programmer to override the implementations of those routines. The implementations provided by the subclass dictate the operations performed by the DB file system in response to file operations involving the programmatically defined file.

Handling File Operations on the Child Files of a Stored Query Directory

As mentioned above, the child files of a stored query directory are presented to applications in the same manner as the child files of conventional directories. However, certain file operations that may be performed on the child files of conventional directories present special issues when performed on the child files of a stored query directory.

For example, assume that a user enters input that specifies that a child file of a stored query directory should be moved to another directory. This operation presents a problem because the child file belongs to the stored query directory by virtue of satisfying the criteria specified in the stored query associated with the directory. Unless the file is modified in a way that causes the file to cease to satisfy those criteria, the file will continue to qualify as a child file of the stored query directory.

A similar problem occurs when an attempt is made to move a file into a stored query directory. If the file is not already a child of the stored query directory, then the file does not satisfy the stored query associated with the stored query directory. Unless the file is modified in a way that causes the file to satisfy the criteria specified by the stored query, the file should not be a child of the stored query directory.

Various approaches may be taken to resolve these issues. For example, the DB file system may be configured to raise an error in response to operations that attempt to move files into or out of stored query directories. Alternatively, the DB file system may respond to such attempts by deleting the file in question (or the database record that is being presented as a file).
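As a rough illustration of the routines and the subclassing approach described above, the following Python sketch models a base "directory class" and a subclass whose children are computed by a stored query. Everything beyond the routine names taken from the text is an assumption made for illustration: the class names, the in-memory `children` dictionary, the `candidates` and `predicate` parameters, and the `move_in` method (which picks the "raise an error" option named in the text). It is not the DB file system's actual implementation.

```python
class Directory:
    """Hypothetical "directory class": default routines for conventional files."""

    def __init__(self, name, content=b""):
        self.name = name
        self.content = content
        self.children = {}  # filename -> Directory (assumed in-memory store)

    def resolve_filename(self, filename):
        """Return the handle of the child file named `filename`."""
        return self.children[filename]

    def list_directory(self):
        """Return the names of all child files."""
        return sorted(self.children)

    def fetch(self):
        """Retrieve the contents of this file."""
        return self.content

    def put(self, data):
        """Insert data into this file."""
        self.content = data

    def delete(self):
        """Delete this file's contents and children."""
        self.children.clear()
        self.content = b""


def resolve_pathname(root, path):
    """Call resolve_filename once for each filename in `path`."""
    handle = root
    for filename in filter(None, path.split("/")):
        handle = handle.resolve_filename(filename)
    return handle


class StoredQueryDirectory(Directory):
    """Programmatically defined file: children are the files satisfying a query."""

    def __init__(self, name, candidates, predicate):
        super().__init__(name)
        self.candidates = candidates  # files the query is evaluated against
        self.predicate = predicate    # stands in for the stored query

    def list_directory(self):
        return sorted(f.name for f in self.candidates if self.predicate(f))

    def resolve_filename(self, filename):
        for f in self.candidates:
            if f.name == filename and self.predicate(f):
                return f
        raise KeyError(filename)

    def move_in(self, file):
        # Hypothetical hook: membership is decided by the query, so reject moves.
        raise PermissionError("cannot move files into a stored query directory")
```

Because `resolve_pathname` only calls the overridable routines, a path such as "/nonempty/a.txt" resolves the same way whether "nonempty" is a conventional directory or a `StoredQueryDirectory`.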
Event Notification within a File System

According to one aspect of the invention, a file system is provided in which users are proactively notified upon the occurrence of certain file system events. Because they are proactively notified, they need not incur the overhead of repeated polling to detect conditions that indicate that the events of interest have occurred. The ability to be notified upon the occurrence of a file system event is extremely useful, for example, when particular file system events have significant meaning to a user.

For example, it is common for multiple copies of a document to be maintained at different locations ("cached") to provide more efficient access to the document. Under these conditions, if one of the copies is updated, the remaining copies are rendered stale (i.e. they no longer reflect the current state of the document). Using the event notification techniques described hereafter, when one copy is updated, the sites at which the other copies reside can be proactively notified of the update. Processes or users at those sites may then take whatever action is appropriate under the circumstances. In the case of a cache, the appropriate action may be, for example, to replace the cached version of the document with the updated version.

As another example, a particular user may be responsible for reviewing all of the technical documents of a company before they are published. The technical writers of that company may be instructed to store all technical documents into a "ready for review" directory when they are ready for review by that user. Without a proactive notification system, the mere storage of a technical document into the "ready for review" directory does not make the user aware that a new document is ready for review. Rather, some additional work would be required, such as the technical writer informing the user that the document is ready for review, or the user periodically checking the "ready for review" directory. In contrast, with a file system that implements the event notification techniques described herein, the act of placing a technical document into the "ready for review" directory could trigger the generation of a message to the user to notify the user that a new technical document is ready for review.

According to one embodiment of the invention, rules may be defined for proactively generating messages for file system events. Such events include, for example, storage or creation of files in a particular directory, deletions of files in a particular directory, movement of files out of a particular directory, modification or deletion of a particular file, and linking a file to a particular directory. These file system operations are merely representative. The specific operations for which proactive notification rules may be created may vary from implementation to implementation. The present invention is not limited to providing event notification support for any particular set of file system operations.

According to one embodiment, event_ids are assigned to file system events. Notification rules may then be created which specify an event_id and a set of one or more subscribers. Once a rule has been registered with the file system, the set of consumers identified in the rule are automatically sent messages in response to the occurrence of the file system event identified by the event_id of the rule.

For example, a user may register an interest in knowing when files are added to a particular directory. To record this interest, the database server (1) inserts a row into a "registered rules" table, and (2) sets a flag associated with the directory to indicate that at least one rule has been registered for the directory. The row inserted into the registered rules table identifies the entity and indicates the event in which the entity is interested. The row may also include additional information, such as the protocol to use to communicate with the entity. The flag that indicates that a rule applies to the directory may be stored in the files table row associated with the directory, in the hierarchical index entry associated with the directory, or both.

When inserting a file into a directory, the database server inspects the flag associated with the directory to determine whether any rules have been registered for that directory. If a rule has been registered for that directory, then the registered rules table is searched to find the specific rules that apply to the directory. If the registered rules include rules that apply to the specific operation that is being performed on the directory, then messages are sent to the interested entities identified in those rules. The protocol used to send the messages to the entities may vary from entity to entity. For example, for some entities the message may be sent via CORBA, while for other entities the message may be sent in the form of an HTML page via HTTP.

According to one embodiment, the notification mechanism is implemented in conjunction with a database-implemented file system, as described above, using a queuing mechanism such as the queuing mechanism described in the patent application entitled "APPARATUS AND METHOD FOR MESSAGE QUEUING IN A DATABASE SYSTEM", filed by Chandra et al. on Oct. 31, 1991, the entire contents of which are incorporated herein by reference.

According to one such embodiment, an event server executing external to a database server is registered as a subscriber to a queue managed by the database server. The queue to which the event server subscribes shall be referred to herein as the file event queue. Entities that are interested in particular file system events register their interest with the event server. The event server communicates with the database server through the database API, and with the interested entities through the protocols supported by those entities.

When the database server performs an operation related to the file system, the database server places into the file event queue a message that indicates the event_id associated with the operation. The queuing mechanism determines that the event server has registered an interest in the file event queue, and transmits the message to the event server. The event server searches a list of interested entities to determine whether any entity has registered an interest in the event identified in the message. The event server then transmits a message that indicates the occurrence of the file system event to all entities that have registered an interest in the event.

In an embodiment that uses event servers to forward messages to interested entities, the event servers may be configured to support a certain maximum number of users. If the number of interested users exceeds the maximum, then additional event servers are initiated to service the additional users. Similar to the single event server scenario, each event server in a multiple event server system is registered as a subscriber to the file event queue.

According to an alternative embodiment, the entities that are interested in file system events are directly registered as subscribers to the file event queue. As part of the registration information, the entities indicate the event_ids of the file system events in which they are interested. When the queuing mechanism places a message in the file event queue, the queuing mechanism does not automatically send the message to all queue subscribers. Rather, the queuing
mechanism inspects the registration information to determine which entities have registered an interest in the specific event associated with the message, and selectively sends the message to only those entities. In the case of entities that do not support the database API, the registration information includes information about the protocol supported by those entities. The queuing mechanism transmits the file event messages to those entities using the protocols listed in their registration information.

File system event notification may be applied in a variety of contexts. For example, at times it is desirable to store on a first machine a cache of files that reside on a second machine. One currently available mechanism to implement such a file cache is the "briefcase" feature provided by Microsoft Windows operating systems. The briefcase feature allows users to create a special folder (a "briefcase") on one machine, and copy into that briefcase files that are stored on other machines. Each briefcase has an "update" option which, when selected, causes the file system to compare the copy of the file that is in the briefcase with the copy of the file that is in the original location. If the files do not have the same modification date, then the file system allows the user to synchronize the two copies (typically by copying the newer copy over the older copy).

Unlike the briefcase mechanism, the file system event notification mechanism allows a file cache to be proactively updated so that it always reflects the current state of the files at their original locations. For example, the process that manages the file cache may register an interest in updates to the original copies of the files contained in the cache. Consequently, the process will automatically be informed when any of the original files are updated, and may immediately respond by copying the updated files into the file cache. Similarly, the file system event notification mechanism may be used to mirror on a first machine one or more directories that reside on a second machine. To use the file system event notification mechanism in this manner, a process for maintaining the mirrored directories initially makes copies of the directories and all of the files contained therein, and then registers its interest in changes made to the directories and the files contained in the directories. When informed that a change has been made to a directory, the process makes a corresponding change to the copy of the directory. Similarly, when informed of a change to any of the files within the mirrored directories, the process makes a corresponding change to the copy of the file.

For example, if a file is moved from a directory that is mirrored to a directory that is not mirrored, the process deletes the copy of the file from the mirrored directory, and unregisters its interest in the file. Thus, the process will not continue to be notified when the file is updated. Similarly, if a file is moved from a directory that is not mirrored to a directory that is mirrored, the process will be informed that the directory has changed. In response to that message, the process identifies the new file, makes a copy of the new file in the mirrored directory, and registers its interest in the new file.

Version Management in the File System

In the workplace, large assignments that involve many people working together for extended periods of time are referred to as "projects". While working on a project, workers typically generate numerous documents, each of which is in some way related to the project.

Similarly, within a computer system, users frequently create numerous electronic documents that all relate to a project. For example, programmers located at numerous sites around the world may each be working on different portions of the same computer program. The electronic documents that they generate for that computer program, which typically would include source code files, belong to a single project. Thus, within the context of this discussion, projects are collections of related files.

Typically, the files of a project will be organized into specific folders. For example, FIG. 13 shows an example of how files related to a project "Big Project" may be organized into various folders. Referring to FIG. 13, a folder entitled Big Project 1302 has been created to hold all files (directories and documents) related to the project. The immediate child files of Big Project 1302 are the folders source code 1304 and docs 1306. Source code 1304 includes two directories, LA code 1312 for storing the source code 1316 and 1318 of programmers located in Los Angeles, and SF code 1314 for storing source code 1320 of programmers located in San Francisco. Docs 1306 includes two folders: specs 1308 and user manual 1310. Specs 1308 includes spec 1322 and 1324. User manual 1310 includes UM 1326.

Frequently, files within a project will contain references (e.g. HTML links) to other files within the same project. These references typically identify the other document using the full pathname of the document. Consequently, if a document is moved from one location in the directory hierarchy to another, or the name of the document is changed, then all references to that document are rendered invalid.

Due to the existence of inter-document references, new versions of files are typically stored with the same name and in the same location as the older versions that they are replacing. In conventional file systems, this process overwrites the older version of the file, making it irrecoverable. Unfortunately, there are many circumstances in which it is desirable to recover older versions of files. For example, critical information may have been inadvertently deleted from the newer version. If the older version is irrecoverable, then the user may have to spend significant resources to recreate the lost material, if it can be recreated at all. In addition, it is often desirable to be able to reconstruct the change history for a file, to be able to determine when a particular change was made, or to be able to determine what was changed at a given point in time.

According to one aspect of the invention, a versioning mechanism is provided in which new versions of files are saved in the same location in the directory hierarchy using the same name as the older versions without overwriting the older versions. Rather than overwrite the older versions, the older versions are retained, and users can selectively retrieve older versions of files. Further, the older versions are retained at their original locations in the directory hierarchy. As shall be described in greater detail hereafter, novel directory versioning techniques are provided that allow the file system to retain, at the same location within a directory hierarchy, multiple versions of the same file with the same name.

Because the creation of new versions does not change the name or location of the original versions, any references to a first version of a file continue to point to the first version of the file even when a newer version of the file is created. Thus, inter-file references contained within a document continue to point to the correct versions of the referenced documents, even if newer versions of the referenced documents have been created. The fact that inter-file references remain valid (i.e. continue to refer to the correct version of
the referenced files) during the versioning process has a significant beneficial impact on the efficiency of file retrieval. Specifically, rather than necessitating the performance of a look-up operation to find the appropriate version of a referenced file, referenced files may be retrieved directly by following references to them contained within other files. Similarly, the process of determining the contents of a directory at a particular point in time need not involve look-up operations. Since directories are themselves versioned, selection of a particular version of a directory implicitly selects the members of the directory. The selected version of a directory will contain direct links to the correct files, and the correct version of the files, that belong to that version of the directory.

Techniques are also provided for tracking the relationship between versions of the same file even when the name of the file changes from version to version. As shall be described in greater detail hereafter, a FileID and version number are maintained for each version of each file, in addition to the file's name. If two files have the same FileID, they are different versions of the same file even though they may have different names.

According to one aspect of the invention, a mechanism is provided to allow users to select the "view" of a project that they want to see. A view of a project presents the files of the project as they existed at a particular point in time. For example, the default view presented to users may present the most current version of all files. Another view may present the version of the files that was current as of one day earlier. Another view may present the version of the files that was current as of one week earlier.

According to one embodiment, a version tracking mechanism is provided by storing a version number with each file in a project. For example, in a file system implemented in a database system using a files table, such as files table 710, one column of the row associated with a file may store a version number for the file. Whenever a file is created, a row for the file is inserted into the files table 710, and a predetermined initial version number (e.g. 1) is stored in the version column of that row.

When the file is updated, the previous version of the file is not overwritten. Rather, a new row is inserted in the files table for the new version of the file. The row for the new version contains the same FileID, Name, and Creation Date as the original row, but includes a higher version number (e.g. 2), a new Modification Date, and possibly a different file size, etc. In addition, the BLOB that stores the content of the file will reflect the update, while the BLOB of the original entry remains unchanged.

According to one embodiment, when a file and the directory in which the file resides both belong to a project, then a change to the file effectively creates a new version of the directory. Consequently, an update to a file in a directory will not only cause the creation of a files table row for the new version of the file, but will cause the creation of a files table row for the new version of the directory. In an embodiment that uses a hierarchical index, an index entry for the new version of the directory would also be added to the hierarchical index.

If both a directory and the parent directory belong to the same project, then the creation of a new version of the directory effectively creates a new version of the parent directory. Consequently, new rows are also added to the files table and hierarchical index for the parent directory of the directory. This process continues, causing new versions to be created for all directories that belong to a project and that reside above an updated file in the file hierarchy.

To illustrate how the versioning mechanism responds to an update of a file that belongs to a project, assume that all files shown in FIG. 13 are version 1, and that an update is performed to code 1320. As illustrated in FIG. 14, the versioning mechanism responds to the update by creating a new version of code 1320' without deleting the original version of the code 1320. Code 1320 belongs to SF code directory 1314, so a new version of SF code directory 1314' is created without deleting the original version. SF code directory 1314 belongs to source code directory 1304, so a new version of source code directory 1304' is created without deleting the original version. Finally, source code directory 1304 belongs to big project directory 1302, so a new version of big project 1302' is created without deleting the original version.

As illustrated in FIG. 14, when a new version of a parent file is created in response to a new version of a child file, the new version of the parent file continues to have the same children as it had before the update, with the exception that the new version of the updated file is its child, rather than the original version of the updated file. For example, the new version of code 1320' is the child of the new version of SF code 1314'. The new version of SF code 1314' is a child of the new version of source code 1304'. However, the unchanged child files of the original source code 1304 (e.g. LA code 1312) continue to be child files of the new version of source code 1304'. Similarly, the new version of source code 1304' is the child of the new version of big project 1302', but the unchanged child files of the original big project (e.g. docs 1306) continue to be child files of the new version of big project 1302'.

In an embodiment in which the file system is implemented using a hierarchical index, the index entry created for a new version of a directory would contain the same Dir_entry_list as the index entry for the previous version of the directory, except that the array entry for the child file that was updated is replaced with an array entry for the new version of the child file. If the updated child file was a child directory, then the Dir_entry_list array entry for the new directory would include the RowID, within the hierarchical index, of the index entry for the new version of the child directory.

When a file that belongs to a project is moved from one directory in the project to another directory in the project, the file itself has not been changed, so a new version of the file is not created. However, the directory from which the file was moved, and the directory into which the file was placed, have both been changed. Consequently, new versions are created for those directories and all ancestor directories of those directories that are in the same project. FIG. 15 illustrates the new directories that would be created in response to code 1318 of FIG. 13 being moved from LA code 1312 to SF code 1314. Specifically, new versions of LA code 1312' and SF code 1314' would be created. The new version of LA code 1312' would not have code 1318 as its child. Rather, code 1318 would be the child of the new version of SF code 1314'. A new source code directory 1304' is created and linked to the new versions of LA code 1312' and SF code 1314'. A new big project directory 1302' is created and linked to the new source code directory 1304', and to the original docs directory 1306.

Using the versioning technique described above, a new version of the root directory of a project (e.g. big project 1302) is created after every change to the project. The links that descend from each version of the root project directory link together all files that belonged to the project at a particular point in time, and the versions of the files thus
linked are the versions that existed at that particular point in time. For example, referring to FIG. 14, the links descending from big project 1302 reflect the project as it existed prior to the update to code 1320. The links descending from big project 1302' reflect the project as it existed immediately after the update to code 1320. Similarly, in FIG. 15, the links descending from big project 1302 reflect the project as it existed prior to moving code 1318 from LA code 1312 to SF code 1314. The links descending from big project 1302' reflect the project as it existed immediately after moving code 1318 from LA code 1312 to SF code 1314.

Tagging

Unfortunately, the versioning technique described above causes a significant proliferation of file versions, particularly of the directories that are at higher levels of a project. Under some conditions, this proliferation may be both unnecessary and undesirable. Therefore, according to one embodiment of the invention, a mechanism is provided for "tagging" versions of files. Tagging a version of a file indicates that that version of the file should be retained. Thus, rather than always retaining older versions of files when newer versions are created, older versions of files are retained only if they have been tagged. Otherwise, they are replaced (overwritten) when newer versions are created.

Referring to FIG. 13, assume that code 1320 has not been tagged. If code 1320 is updated, the new version of the code merely replaces the old version of the code. Only if code 1320 has been tagged are separate new versions made of code 1320, SF code 1314, source code 1304 and big project 1302, as illustrated in FIG. 14.

Under many circumstances, tags will be applied to all files within a project at the same time. For example, if a particular version of a software program is released, all of the source code used to create the released version of the program may be tagged at that point in time. Consequently, the exact set of source code associated with the released version will be available for later reference regardless of subsequent revisions to the source code files.

In an embodiment where tags are always applied to a project as a whole, a single tag may be maintained for the root project directory. If a file is located using a version of the root project directory that is tagged, then any change to that file will cause a new version of the file to be created while the original version of the file is retained. If, on the other hand, a file is located using a version of the root project directory that is not tagged, then any change to that file will merely overwrite the previous version of the file.

According to another embodiment, applying a tag to a file effectively applies a tag to all files that reside below that file in the file hierarchy. For example, assume that a tag is applied to LA code 1312. If code 1318 is moved out of LA code 1312, then a new version of LA code 1312 is created. If code 1318 is updated, then new versions of both code 1318 and LA code 1312 are created. In such an embodiment, if a file is located by traversing the file hierarchy through any tagged file, then any change to that file causes a new version of the file to be created. If a file is located without traversing any file in the hierarchy that is tagged, then any change to that file overwrites the previous version of the file.

Purge Count

Another technique for reducing the proliferation of versions, which may be employed instead of or in addition to tagging, involves maintaining a purge count. A purge count indicates the maximum number of versions that will be retained for any given file. If a new version is created for a file which is already at the purge count number of versions, the new version of that file overwrites the oldest retained version of that file. A purge count may be implemented on a per-file system, per-project, or per-file basis. When implemented on a per-file system basis, a single purge count applies to all files maintained in the file system. On a per-project basis, all files in a given project have the same purge count, but different projects may have different purge counts. On a per-file basis, a different purge count may be specified for each file.

When used in combination with tagging, the purge count mechanism may be implemented in a variety of ways. According to one embodiment, tagged files are ignored for the purpose of determining whether creating a new version of a file would exceed the purge count, and tagged files are never deleted by the purge count mechanism. For example, assume that the purge count for a file is five, that five versions of the file exist, and that one of those five versions is tagged. When an update is made to the file, the purge count mechanism determines that there are currently only four existing non-tagged versions of the file, and therefore creates another version of the file without deleting any of the existing versions. If the same file is updated again, then the purge count mechanism determines that there are five existing non-tagged versions of the file, and therefore deletes the oldest non-tagged version of the file in response to creating a new version.

Inter-Project Links

Each link has a source file (the file from which the link extends) and a target file (the file to which the link points). In the file hierarchy, the source file of a link is frequently a directory, while the target file of the link is a file within the directory. However, not all links are between directories and their children. For example, an HTML file may include hyperlinks to graphic images and to other HTML files. In a file system implemented using a hierarchical index, those hyperlinks may be handled in the same manner as directory-to-document links.

A view of the file system shows how each project in the file system existed at a particular point in time. However, the point in time associated with one project in a view may be different than the point in time associated with another project in the same view. This creates a problem when the source file of a link belongs to a different project than the target file of the link. For example, assume that a view specifies a time T1 for a project P1 that includes a file F1, and a later time T2 for a project P2 that includes a file F2. Assume further that file F2 has a link to file F1. The link contained in the T2 version of F2 will go to the T2 version of P1, not the T1 version of P1. However, because the view specifies T1 for P1, the T1 version of P1 should be used for any operations performed on any files in P1 through the view.

According to one embodiment of the invention, an "inter-project boundary" flag is maintained for each link. The inter-project boundary flag of a link indicates whether the source file and the target file of the link are in the same project. In a file system that uses a hierarchical index, such as hierarchical index 510, an inter-project boundary flag may be stored, for example, in each array entry of an index entry's Dir_entry_list.

During the traversal of the file hierarchy, the inter-project boundary flag of every link is inspected before the link is followed. If the inter-project boundary flag of a link is set,
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then the required version time of the project to which the 
source side file belongs is compared to the required version 
time of the project to which the target side file belongs. If the 
desired version time is the same, then the link is traversed, 
[f the desired version time is not the same, then a search is 5 
performed for the version of the target file that corresponds 
to the required version time of the project to which the target 
side file belongs. 

For example, the inter-project boundary flag of the link
between F2 and F1 would be set. Consequently, a comparison
is made between the required version time of P2 and the
required version time of P1. The required version time of P2
is T2, which is not the same as T1, the required version time
of P1. Therefore, P1 would not be located by following the
link. Rather, a search would be performed to locate the
version of P1 that corresponds to time T1.

According to an alternative embodiment, no inter-project 
boundary flags are maintained. Instead, every time a link is
encountered, the required version time of the source file is
compared to the required version time of the target file. If the
source and target files are in the same project, or if they are 
in different projects that have the same required version 
times, then the link is followed. Otherwise, a search is 
performed to find the correct version of the target file. 
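
The link-handling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Link record, the resolve_link function, and the dictionary that maps project names to required version times are hypothetical names, not part of the described system, and version times are modeled as plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source_project: str   # project of the file that contains the link
    target_project: str   # project of the file the link points to
    inter_project: bool   # the "inter-project boundary" flag


def resolve_link(link, view_times, use_flags=True):
    """Decide how to reach the target of a link under a view.

    view_times maps a project name to the version time the view requires.
    Returns ("follow", t) if the link may be traversed directly, or
    ("search", t) if the target version for time t must be searched for.
    """
    target_time = view_times[link.target_project]
    # Flag embodiment: a link that stays inside one project is simply followed.
    if use_flags and not link.inter_project:
        return ("follow", target_time)
    # Otherwise compare the required version times of both projects.
    if view_times[link.source_project] == target_time:
        return ("follow", target_time)
    return ("search", target_time)


view = {"P1": 1, "P2": 2}  # the view requires T1 for P1 and T2 for P2
f2_to_f1 = Link("P2", "P1", inter_project=True)
print(resolve_link(f2_to_f1, view))                   # flag embodiment
print(resolve_link(f2_to_f1, view, use_flags=False))  # flag-free embodiment
```

With use_flags=False the sketch models the alternative embodiment: every link is compared, and links within one project are followed because both sides share the same required version time.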

Object-Oriented File System 

In recent years, object oriented programming has become 
the standard programming paradigm. In object oriented 
programming, the world is modeled in terms of objects. An
object is a record combined with the procedures and func- 
tions that manipulate it. All objects in an object class have 
the same fields ("attributes"), and are manipulated by the 
same procedures and functions ("methods"). An object is 
said to be an "instance" of the object class to which it
belongs. 

Sometimes an application requires the use of object 
classes that are similar, but not identical. For example, the 
object classes used to model both dolphins and dogs might 
include the attributes of nose, mouth, length and age.
However, the dog object class may require a hair color 
attribute, while the dolphin object class requires a fin size 
attribute. 

To facilitate programming in situations in which an application
requires multiple similar attributes, object oriented
programming supports "inheritance". Without inheritance, a 
programmer would have to write one set of code for the dog 
object class, and a second set of code for the dolphin object 
class. The code implementing the attributes and methods 
common to both object classes would appear redundantly in
both object classes. Duplicating code in this manner is very 
inefficient, especially when the number of common 
attributes and methods is much greater than the number of 
unique attributes. Further, code duplication between object 
classes complicates the process of revising the code, since
changes to a common attribute will have to be duplicated at 
multiple places in the code in order to maintain consistency 
between all object classes that have the attribute. 

Inheritance allows a hierarchy to be established between 
object classes. The attributes and methods of a given object
class automatically become attributes and methods of the 
object classes that are based upon the given object class in 
the hierarchy. For example, an "animal" object class may be 
defined to have nose, mouth, length and age attributes, with 
associated methods. To add these attributes and methods to
the dolphin and dog object classes, a programmer can
specify that the dolphin and dog object classes "inherit" the
animal object class. Under these circumstances, the dolphin
and dog object classes are said to be "subclasses" of the 
animal object class, and the animal object class is said to be 
the "parent" class of the dog and dolphin object classes. 
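
The animal example can be written out as a brief Python sketch. The constructor arguments and the attribute_names helper are illustrative assumptions; only the class names and attributes come from the example above.

```python
class Animal:
    """Parent class holding the attributes common to dogs and dolphins."""

    def __init__(self, nose, mouth, length, age):
        self.nose = nose
        self.mouth = mouth
        self.length = length
        self.age = age

    def attribute_names(self):
        # Method defined once and inherited by every subclass.
        return sorted(vars(self))


class Dog(Animal):
    def __init__(self, nose, mouth, length, age, hair_color):
        super().__init__(nose, mouth, length, age)
        self.hair_color = hair_color  # attribute unique to dogs


class Dolphin(Animal):
    def __init__(self, nose, mouth, length, age, fin_size):
        super().__init__(nose, mouth, length, age)
        self.fin_size = fin_size  # attribute unique to dolphins


dog = Dog("wet", "wide", 90, 4, "brown")
dolphin = Dolphin("long", "narrow", 250, 12, 30)
print(dog.attribute_names())
print(dolphin.attribute_names())
```

The common attributes and the shared method appear only in Animal, so a change to them is made in one place.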

According to one aspect of the invention, a mechanism is 
provided for applying the object-oriented paradigm, includ- 
ing inheritance, to a file system. Specifically, each file in the 
file system belongs to a class. The class of a file
determines, among other things, the type of information that 
the file system stores about the file. According to one 
embodiment, a base class is provided. Users of the file 
system may then register other classes, which may be 
defined as subclasses of the base class or any previously 
registered class. 

When new file classes are registered with the file system, 
the file system is effectively extended to support new types 
of files, and interaction with new types of file systems. For 
example, most e-mail applications expect e-mail documents 
to have a "priority" property. If a file system does not 
provide storage for the priority property, then the e-mail 
applications may not operate properly with e-mail docu- 
ments stored in that file system. Similarly, certain operating 
systems may expect certain types of system information to 
be stored with a file. If the file system does not store that 
information, the operating systems may encounter problems. 
By registering a class that includes all of the attributes 
required to support a particular type of system or protocol 
(e.g. specific operating systems, FTP, HTTP, IMAP4, etc.)
accurate and transparent interaction with that system or 
protocol becomes possible. 

To register a class, information is provided about the 
class, including data that identifies the parent class of the 
class and describes any attributes that the class has that the 
parent class does not have. The information may also specify 
specific methods that operate on instances of the class. 
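
As a rough illustration of registration, the following Python sketch keeps a registry in which each class records its parent and only the attributes it adds. The ClassRegistry name, its methods, and the in-memory dictionary are assumptions made for this sketch; the described system stores this information in the file system itself.

```python
class ClassRegistry:
    """Toy registry: each class records its parent and the attributes it adds."""

    def __init__(self):
        # The base class is available before any registration takes place.
        self._classes = {
            "Files": {"parent": None,
                      "attrs": ["name", "creation_date", "modification_date"]},
        }

    def register(self, name, parent, new_attrs):
        if parent not in self._classes:
            raise KeyError(f"unknown parent class: {parent}")
        if name in self._classes:
            raise ValueError(f"class already registered: {name}")
        self._classes[name] = {"parent": parent, "attrs": list(new_attrs)}

    def attributes(self, name):
        """Return inherited attributes first, then the class's own."""
        chain = []
        while name is not None:
            entry = self._classes[name]
            chain = entry["attrs"] + chain
            name = entry["parent"]
        return chain


registry = ClassRegistry()
registry.register("Document", "Files", ["size"])
registry.register("E-mail", "Document", ["read_flag", "priority", "sender"])
print(registry.attributes("E-mail"))
```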

An object-oriented file system that allows users to register 
file classes, supports inheritance between file classes, and 
stores information about the files based on the class to which 
they belong may be implemented in a variety of ways 
depending on the context in which the file system itself is 
implemented. According to one embodiment, an object- 
oriented file system is provided in the context of a database- 
implemented file system, as described above. However, 
while various aspects of the object-oriented file system shall 
be described relative to a database-implemented 
embodiment, the object oriented file system techniques 
described herein are not limited to such an embodiment. 

Database-Implementation of Object Oriented File
System 

According to one embodiment, a database-implemented 
file system provides a base class, and allows subclasses of 
the base class to be registered with the file system. FIG. 16
illustrates an exemplary set of file classes. The
base class is entitled "Files" and includes attributes that are
generally common to all files, including name, creation date,
and modification date. Similarly, the methods of the Files
class include methods for operations that may be performed
on all files.

According to one embodiment, the attributes of the Files
class are the union of all attributes maintained by the
operating systems with which the database-implemented file
system will be used. For example, assume that the file
system is implemented in a database managed by server 204
as shown in FIG. 3. The files stored in the file system
originate from operating systems 304a and 304b, which do
not necessarily support the same set of file attributes.
Consequently, the set of attributes of the Files class of the
file system implemented by database server 204 would be
the union of the sets of attributes supported by the two
operating systems 304a and 304b.

According to an alternative embodiment, the attributes of
the Files class are the intersection of all attributes maintained
by the operating systems with which the database-implemented
file system is used. In such an embodiment, a
subclass of the Files class could be registered for each
operating system. The subclass registered for a given oper- 
ating system would extend the base Files class by adding all 
of the attributes supported by that given operating system 
that are not already included in the base Files class. 
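
The two ways of choosing the base attributes can be compared with Python sets. The attribute names "owner" and "read_only" below are invented for illustration and do not describe any particular operating system.

```python
# Hypothetical attribute sets for two operating systems.
os_a_attrs = {"name", "creation_date", "modification_date", "owner"}
os_b_attrs = {"name", "creation_date", "modification_date", "read_only"}


def base_attrs_union(*os_attr_sets):
    """First embodiment: the Files class carries every attribute of every OS."""
    return set().union(*os_attr_sets)


def base_attrs_intersection(*os_attr_sets):
    """Alternative embodiment: the Files class carries only shared attributes."""
    return set.intersection(*(set(s) for s in os_attr_sets))


def os_subclass_attrs(os_attrs, base_attrs):
    """Attributes a per-OS subclass must add on top of the base class."""
    return set(os_attrs) - set(base_attrs)


base = base_attrs_intersection(os_a_attrs, os_b_attrs)
print(sorted(base_attrs_union(os_a_attrs, os_b_attrs)))
print(sorted(base))
print(sorted(os_subclass_attrs(os_a_attrs, base)))
```

Under the intersection embodiment, the base class plus the per-OS subclass together cover exactly the attributes that operating system supports.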

In the embodiment illustrated in FIG. 16, two subclasses
of the Files class have been registered: a "Document" class 
and a "Folder" class. The Document class inherits all of the 
attributes and methods of the Files class, and adds attributes 
that are specific to document files. In the illustrated 
embodiment, the Document class adds the attribute "size".

The Folder class inherits all of the attributes and methods
of the Files class and adds attributes and methods that are
specific to folder files (i.e. files, such as directories, that are
able to contain other files). In the illustrated embodiment,
the Folder class introduces a new attribute "max_children"
and a new method "dir_list". The max_children attribute
may, for example, indicate the maximum number of child
files that may be contained in a given folder. The "dir_list"
method may, for example, provide a listing of all of the child
files of a given folder.

In the class hierarchy illustrated in FIG. 16, the Document
class has two registered subclasses: E-mail and Text. Both
subclasses inherit all of the attributes and methods of the
Document class. In addition, the E-mail class includes three
additional properties: read_flag, priority, and sender. The
Text class has one additional attribute, CR_Flag, and an
additional method, Type. The CR_Flag may be a flag to
indicate whether the text document contains carriage
return symbols. The Type method outputs the text document
to an I/O device, such as a computer monitor.

File Class and File Format

The internal structure of a file is referred to as the
"format" of the file. Typically, the format of a file is dictated
by the application that creates the file. For example, a
document created by one word processor may have the same
semantic content but an entirely different format than
another document created by a different word processor. In
some file systems, a mapping is maintained between document
formats and filename extensions. For example, all files
that have filenames ending in .doc are presumed to be files
created by a particular word processor, and therefore are
presumed to have the internal structure imposed by that
word processor. In other file systems, information about the
format of a document is maintained in a separate meta file
associated with the document.

In contrast to file formats, the file class mechanism
described herein does not relate to the internal structure of
a document. Rather, the file class of a file dictates what
information the file system maintains for the file, and what
operations the file system can perform on the file. For
example, documents created by numerous word processors
may all be instances of the Document class. Consequently,
the file system would maintain the same attribute information
about the documents, and allow the same operations to
be performed on the documents, even though the internal
structures of the documents are completely different.

According to one embodiment, an object-oriented file
system is implemented in a relational database system where
a relational table is created for each class of file. FIG. 17 is
an example of the tables that may be created for the classes
illustrated in FIG. 16. Specifically, Files table 1702, Document
table 1704, E-mail table 1706, Text table 1708 and
Folder table 1708 respectively correspond to the Files class,
Document class, E-mail class, Text class and Folder class.

According to one embodiment, the class table for a given 
class includes rows for (1) files that belong to that given 
class, and (2) files that belong to any descendant class of that
given class. For example, in the illustrated system, the Files 
class is the base class. Consequently, every file in the file 
system will be a member of the Files class or a descendant 
class thereof. Therefore, the Files table will include rows for 
all files in the file system. On the other hand, the E-mail class 
and the Text class are descendants of the Document class,
but the Files class and the Folder class are not. Therefore, the 
Document table 1704 includes rows for all files of class 
Document, E-mail or Text, but not for files that are of class 
Files or Folder. 
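
The row-membership rule can be sketched as follows. The PARENT map mirrors the class hierarchy of FIG. 16, and the file-to-class assignments mirror the five files of FIG. 17; the function names are hypothetical.

```python
# Parent of each class in the illustrated hierarchy (None marks the base class).
PARENT = {"Files": None, "Document": "Files", "Folder": "Files",
          "E-mail": "Document", "Text": "Document"}

# Class of each of the five illustrated files.
FILE_CLASSES = {"File1": "Files", "File2": "Document", "File3": "E-mail",
                "File4": "Text", "File5": "Folder"}


def tables_for(file_class):
    """Class tables holding a row for a file of this class, base table first."""
    chain = []
    while file_class is not None:
        chain.append(file_class)
        file_class = PARENT[file_class]
    return chain[::-1]


def rows_in(table):
    """Files that have a row in the given class table."""
    return sorted(name for name, cls in FILE_CLASSES.items()
                  if table in tables_for(cls))


print(tables_for("E-mail"))
print(rows_in("Document"))
```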

The table for each class includes columns to store values 
for the attributes that are introduced by that class. For 
example, the Document class inherits the attributes of the 
Files class, and adds to those attributes the size attribute. 
Therefore, the Document table includes a column for storing 
a size value for the size attribute. Similarly, the E-mail class 
inherits the attributes of the Document class and introduces 
the read_flag, priority, and sender attributes. Consequently,
the E-mail table 1706 includes columns for storing read_flag
values, priority values, and sender values.

Five files are stored in the file system illustrated in FIG.
17. The file named File1 is stored at RowID X1 in Files table
1702. The FileID of File1 is F1. The class of File1 is the Files
class, as indicated by the value stored in the Class column
of row X1. Because File1 is an instance of the Files class, the
Files table 1702 is the only class table that contains
information for File1. Thus, the only attribute values stored for
File1 are values for the attributes associated with the Files
class.

The file named File2 is stored at RowID X2 in Files table
1702. The FileID of File2 is F2. The class of File2 is the
Document class, as indicated by the value stored in the Class
column of row X2. Because File2 is an instance of the
Document class, the Files table 1702 and Document table
1704 contain information for File2. Thus, the attribute
values stored for File2 are values for the attributes associated
with the Document class, including those attributes
inherited from the Files class.

The file named File3 is stored at RowID X3 in Files table 
1702. The FileID of File3 is F3. The class of File3 is the
E-mail class, as indicated by the value stored in the Class
column of row X3. Because File3 is an instance of the
E-mail class, the Files table 1702, the Document table 1704
and the E-mail table 1706 all contain information for File3.
Thus, the attribute values stored for File3 are values for the 
attributes associated with the E-mail class, including those 
attributes inherited from the Document and Files classes. 

The file named File4 is stored at RowID X4 in Files table 
1702. The FileID of File4 is F4. The class of File4 is the Text
class, as indicated by the value stored in the Class column 
of row X4. Because File4 is an instance of the Text class, the 
Files table 1702, the Document table 1704 and the Text table 
1708 contain information for File4. Thus, the attribute 
values stored for File4 are values for the attributes associated
with the Text class, including those attributes inherited
from the Document and Files classes. 

The file named File5 is stored at RowID X5 in Files table
1702. The FileID of File5 is F5. The class of File5 is the
Folder class, as indicated by the value stored in the Class
column of row X5. Because File5 is an instance of the Folder
class, the Files table 1702 and the Folder table 1708 contain
information for File5. Thus, the attribute values stored for
File5 are values for the attributes associated with the Folder
class, including those attributes inherited from the Files 
class. 

According to one embodiment of the invention, the files 
within the class tables are accessed by traversing a hierar- 
chical index, as described above with reference to FIGS. 5 
and 8. A traversal of the hierarchical index (as is performed 
during pathname resolution) produces the RowID of the row 
within Files table 1702 that corresponds to a target file. From 
that row, attribute values for the Files class attributes may be 
retrieved. However, for files that belong to other classes, 
additional attributes may have to be retrieved from other 
class tables. For example, for File3 the creation and modi- 
fication dates may be retrieved from row X3 of Files table 
1702. However, to retrieve the size of File3, row Y2 of 
Document table 1704 must be accessed. To retrieve the 
priority information for File3, row Q1 of E-mail table 1706
must be accessed. 

To facilitate the retrieval of the various attribute values 
that belong to a file, the rows containing those attributes are 
linked to each other. In the illustrated embodiment, the links 
are stored in columns labeled "Derived RowID". The value 
stored in the Derived RowID column of a row for a 
particular file in a table for a particular class points to the 
row for that particular file that resides in a table for a 
subclass of that particular class. For example, the Derived 
RowID column of the Files table row X3 for File3 contains 
the value Y2. Y2 is the RowID of the row for File3 in the 
Document table 1704. Similarly, the Derived RowID column
of the Document row Y2 contains the value Q1. Q1 is
the RowID of the row for File3 in the E-mail table 1706.
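
A minimal Python sketch of following the Derived RowID links for File3 is shown below. The dictionaries stand in for the class tables; the attribute values (size, priority, sender) are invented for illustration, and the chain of tables to visit is assumed to be derived from the Class column.

```python
PARENT = {"Files": None, "Document": "Files", "E-mail": "Document"}

# Toy class tables keyed by RowID; attribute values are illustrative only.
TABLES = {
    "Files": {"X3": {"FileID": "F3", "Class": "E-mail", "name": "File3",
                     "derived_rowid": "Y2"}},
    "Document": {"Y2": {"FileID": "F3", "size": 1024, "derived_rowid": "Q1"}},
    "E-mail": {"Q1": {"FileID": "F3", "read_flag": False, "priority": "high",
                      "sender": "alice", "derived_rowid": None}},
}


def class_chain(file_class):
    """Tables to visit for a class, starting with the base Files table."""
    chain = []
    while file_class is not None:
        chain.append(file_class)
        file_class = PARENT[file_class]
    return chain[::-1]


def load_attributes(files_rowid):
    """Collect all attribute values of a file by following Derived RowID links."""
    file_class = TABLES["Files"][files_rowid]["Class"]
    attrs, rowid = {}, files_rowid
    for table in class_chain(file_class):
        row = TABLES[table][rowid]
        attrs.update({k: v for k, v in row.items() if k != "derived_rowid"})
        rowid = row["derived_rowid"]  # link to the row in the subclass table
    return attrs


print(load_attributes("X3"))
```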

In the illustrated embodiment, the links between the rows 
for a particular file are unidirectional, going from the row in 
the table for a parent class to the row in the table of a 
subclass. These unidirectional links facilitate searches that 
start with rows in the base table (i.e. the Files table), which
under most conditions will be the case. However, if the 
starting point of a search is the row of another table, the 
related rows in the parent class tables cannot be located by 
the links. To find those related rows, a search of those tables 
may be performed based on the FileID of the file of interest.

For example, assume that a user has retrieved row Y2 of
Document table 1704, and desires to retrieve all of the other 
attribute values for File3. The row containing the E-mail- 
specific attribute values may be found by following the 
pointer in the Derived RowID column of row Y2, which 
points to row Q1 in E-mail table 1706. However, to find the
remaining attributes, the Files table 1702 is searched based
on the FileID F3. Such a search would find row X3, which
contains the remaining attribute values of File3. 
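
The search that starts from row Y2 might look like the following Python sketch. The tables are toy dictionaries with invented attribute values, and the subclass table of this Document row is hard-coded to the E-mail table for brevity; both are assumptions of the sketch.

```python
FILES = {"X3": {"FileID": "F3", "Class": "E-mail", "name": "File3",
                "derived_rowid": "Y2"}}
DOCUMENT = {"Y2": {"FileID": "F3", "size": 2048, "derived_rowid": "Q1"}}
EMAIL = {"Q1": {"FileID": "F3", "priority": "high", "derived_rowid": None}}


def attributes_from_document_row(rowid):
    """Gather all attributes of a file, starting from its Document table row."""
    doc_row = DOCUMENT[rowid]
    attrs = dict(doc_row)
    # Downward: the Derived RowID link leads to the subclass row, if any.
    if doc_row["derived_rowid"] is not None:
        attrs.update(EMAIL[doc_row["derived_rowid"]])
    # Upward: no link exists, so scan the Files table for the matching FileID.
    for files_row in FILES.values():
        if files_row["FileID"] == doc_row["FileID"]:
            attrs.update(files_row)
            break
    attrs.pop("derived_rowid")  # the link column is bookkeeping, not an attribute
    return attrs


print(attributes_from_document_row("Y2"))
```

The upward step is a scan here; in a relational database it would more plausibly be an indexed lookup on the FileID column.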

According to an alternative embodiment, the links 
between related rows may be implemented in a way that 
allows all related rows to be located without a FileID
lookup. For example, each class table may also have a Parent 
RowID column that contains the RowID of the related row 
in a parent class table. Thus, the Parent RowID column for 
row Y2 of Document table 1704 would point to row X3 in 
the Files table 1702. Alternatively, the last row in the chain
of unidirectional links may include a pointer back to the
related row in the Files table. Yet another alternative
involves establishing, for each class table, a column that
includes a pointer back to the related row in the Files table.
Thus, row R1 of Text table 1708 and row Y3 of Document
table 1704 would both include pointers back to row X4 of 
Files table 1702. 

Subclass Registration 

As mentioned above, a mechanism is provided for extend- 
ing the class hierarchy of the file system by registering new 
classes. In general, the information provided during the class 
registration process includes data that identifies the parent 
class of the new class, and data that describes attributes that 
are added by the new class. Optionally, the data may also 
include data used to identify new methods that can be 
performed on instances of the new class. 

The registration information may be provided to the file 
system using any one of numerous techniques. For example, 
a user may be presented with a graphical user interface that 
includes icons representing all of the registered classes, and 
the user may operate controls presented by the user interface 
to (1) select one of the classes as the parent of a new class, 
(2) name the new class, (3) define additional attributes for
the new class, and (4) define new methods that may be 
performed on the new class. Alternatively, a user may 
provide to the file system a file containing the registration 
information for a new class. The file system parses the file 
to identify and extract the information, and builds a class file 
for the new class based on the information. 

According to one embodiment of the invention, the class
registration information is provided to the file system in the
form of an Extensible Markup Language (XML) file. The
XML format is described in detail at
www.oasis-open.org/cover/xml.html#contents and at the sites listed there. In
general, the XML language includes tags that name fields
and mark the beginnings and ends of fields, and values for
those fields. For example, an XML document containing
registration information for the "Folder" file class may
contain the following information:

<typename>
  folder
</typename>
<inherits_from>
  files
</inherits_from>
<dbi_classname>
  my_folder_methods
</dbi_classname>
<prop_def>
  <name>
    max_children
  </name>
  <type>
    integer
  </type>
</prop_def>

In response to receiving this file class registration
document, the file system creates a table for the new class
Folder. The new table thus created includes a column for
each of the attributes defined in the registration information.
In the present example, only the max_children attribute is
defined. The data type specified for the max_children
attribute is "integer". Consequently, the Folder table is
created with a max_children column that holds integer
values. In addition to the name and type of an attribute,
various other information may be provided for each
attribute. For example, the registration information may
indicate a range or maximum length for attribute values, and
whether the column should be indexed or subject to a
uniqueness or referential constraint.
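
The step from registration document to class table can be sketched with Python's standard xml.etree.ElementTree module. Several things here are assumptions of the sketch: the fields are wrapped in a hypothetical <registration> root element so that the document parses as a single XML tree, the SQL_TYPES mapping is invented, and the generated statement is a plain string rather than a call into any particular database.

```python
import xml.etree.ElementTree as ET

# The wrapping <registration> element is an assumption of this sketch.
REGISTRATION = """<registration>
  <typename>folder</typename>
  <inherits_from>files</inherits_from>
  <dbi_classname>my_folder_methods</dbi_classname>
  <prop_def>
    <name>max_children</name>
    <type>integer</type>
  </prop_def>
</registration>"""

# Hypothetical mapping from registration types to SQL column types.
SQL_TYPES = {"integer": "INTEGER", "string": "VARCHAR(255)"}


def build_create_table(xml_text):
    """Return a CREATE TABLE statement with one column per prop_def."""
    root = ET.fromstring(xml_text)
    table = root.findtext("typename").strip()
    columns = []
    for prop in root.findall("prop_def"):
        name = prop.findtext("name").strip()
        sql_type = SQL_TYPES[prop.findtext("type").strip()]
        columns.append(f"{name} {sql_type}")
    return f"CREATE TABLE {table} ({', '.join(columns)})"


print(build_create_table(REGISTRATION))
```

A fuller version would also emit the linking columns and any range, index, or constraint information supplied with each attribute.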

The registration information also includes information 
about any methods supported by the new file class. According
to one embodiment, the new methods are specified by
identifying a file that contains the routines associated with 
those methods. According to one embodiment, the routines 
associated with each file class are implemented in a JAVA 
class. If a first file class is a subclass of a second file class, 
then the JAVA class that implements the methods associated
with the first file class is a subclass of the JAVA class that 
implements the methods of the second file class. 

In the XML example given above, the dbi_classname
field of the registration information specifies a JAVA class
file for the Folder file class. Specifically, the registration
information provides the filename "my_folder_methods"
for the dbi_classname field to indicate that the
my_folder_methods JAVA class implements the routines for the
non-inherited methods of the Folder class. Because the Folder
class is a subclass of the Files class, the my_folder_methods
class would be a subclass of the JAVA class that
implements the methods for the Files class. Thus, the
my_folder_methods class would inherit the Files methods.

In addition to defining new methods that are not supported 
by a parent file class, the routines for a child file class can
override the implementation of methods defined in the 
parent class. For example, the Files class illustrated in FIG. 
16 provides a "store" method. The Folder class inherits the 
store method. However, the implementation of the store 
method provided for the Files class may not be the implementation
required to store folders. Therefore, the Folder
class may provide its own implementation of the store 
method, thus overriding the implementation provided by the 
Files class. 
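
The overriding relationship can be illustrated with a Python analogue of the JAVA classes described above. The class names FilesMethods and FolderMethods, the method bodies, and the returned strings are all hypothetical; only the "store" and "dir_list" method names come from the description.

```python
class FilesMethods:
    """Stand-in for the class implementing the methods of the Files class."""

    def store(self, name):
        return f"stored file {name}"

    def move(self, name, destination):
        return f"moved {name} to {destination}"


class FolderMethods(FilesMethods):
    """Stand-in for my_folder_methods: inherits move, overrides store."""

    def store(self, name):
        # Folders need their own storage logic, so the inherited
        # implementation is replaced.
        return f"stored folder {name} and its child entries"

    def dir_list(self, children):
        # New method introduced by the Folder class.
        return sorted(children)


folder = FolderMethods()
print(folder.store("reports"))          # overridden implementation
print(folder.move("reports", "/old"))   # inherited implementation
print(folder.dir_list(["b.txt", "a.txt"]))
```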

Determining the Class of a File 

When the file system is asked to perform an operation on 
a file, the file system invokes the routines that implement the 
requested operation for the particular class of file to which 
the file belongs. As mentioned above, that same operation
may be implemented differently for different file classes 
when, for example, a subclass has overridden the imple- 
mentation provided by its parent class. Thus, to ensure that 
the proper operation is performed, the file system must first 
identify the class of the file upon which the operation is to
be performed. 

For files already stored in the file system, the task of 
identifying the class of the files may be trivial. For example, 
in the embodiment illustrated in FIG. 17, the Files table 1702 
includes a Class column that, for any given row, stores data
indicating the class of file associated with that row. Thus, if 
a request is received for performing a "move" operation on 
File3, the Class column of row X3 may be inspected to 
determine that File3 is of type E-mail. Consequently, the 
E-mail implementation of "move" should be executed. The
E-mail implementation of "move" would be the implemen- 
tation provided for the E-mail file class if the E-mail file 
class overrides the implementation of its inherited "move" 
method. Otherwise, the E-mail implementation of "move" is 
the implementation that is inherited by the E-mail class.
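
Dispatch on the Class column can be sketched as below. The operation classes, the OPS_BY_CLASS mapping, and the perform function are illustrative assumptions; in this sketch the E-mail class overrides "move" while the Document class simply inherits it.

```python
class FilesOps:
    def move(self, name):
        return f"Files.move({name})"


class DocumentOps(FilesOps):
    pass  # inherits move unchanged


class EmailOps(DocumentOps):
    def move(self, name):
        return f"E-mail.move({name})"  # overrides the inherited move


# Hypothetical mapping from Class column values to implementing classes.
OPS_BY_CLASS = {"Files": FilesOps, "Document": DocumentOps, "E-mail": EmailOps}

# Toy Files table: the Class column records the class of each stored file.
FILES_TABLE = {
    "X2": {"name": "File2", "Class": "Document"},
    "X3": {"name": "File3", "Class": "E-mail"},
}


def perform(rowid, operation):
    """Look up the file's class, then run that class's implementation."""
    row = FILES_TABLE[rowid]
    ops = OPS_BY_CLASS[row["Class"]]()
    return getattr(ops, operation)(row["name"])


print(perform("X3", "move"))  # E-mail override is used
print(perform("X2", "move"))  # Document falls back to the Files version
```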

The task of identifying the class of a file may be more
difficult when the file is not already stored in the file system.
For example, when the file system is asked to store a file that
is not already in the file system, the file system cannot make
the class determination by inspecting the Files table. Under
these conditions, various techniques may be used to identify
the type of the file. According to one embodiment, the type
of the file may be expressly provided in the file operation
request. For example, if the request is made in response to
a command issued through the command-line of an operating
system, one of the command-line arguments may be
used to indicate the file type of the file. For example, the
command may be entered as: "move a:\mydocs\file2
c:\yourdocs /class=document".

Another technique for determining the class of a file 
involves determining the class based on information con- 
tained in the name of the file. For example, all files with 
certain extensions (e.g. .doc .wpd .pwp, etc.) may all be 
treated as members of a particular file class (e.g. Document). 
Consequently, when the file system is asked to perform 
operations on those files, the method implementations asso- 
ciated with that particular file class are used. 

Yet another technique for determining the class of a file
involves determining the class based on the location of the
file within the file system hierarchy. For example, all files
created within a particular directory or set of directories may
be presumed to belong to a particular file class, regardless of
how the files are named. These and other techniques may be
combined in a variety of ways. For example, a file with a
particular extension may be treated as a member of a first
class unless the file is stored in a directory associated with
a second class. If the file is stored in the directory associated
with the second class, then the file is treated as a member of
the second class unless the file operation request explicitly
identifies the file to be a member of another file class.
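
One way to combine the three techniques, in the precedence order of the example above, is sketched here. The extension and directory mappings, the "/mail" directory, the use of POSIX-style paths, and the fallback to the base Files class are all assumptions of this sketch.

```python
import posixpath

# Hypothetical mappings; a real system would hold these as configuration.
EXTENSION_CLASSES = {".doc": "Document", ".wpd": "Document", ".pwp": "Document"}
DIRECTORY_CLASSES = {"/mail": "E-mail"}


def determine_class(path, explicit_class=None):
    """Pick a file class: explicit request, then directory, then extension."""
    if explicit_class is not None:
        return explicit_class
    directory, filename = posixpath.split(path)
    for prefix, file_class in DIRECTORY_CLASSES.items():
        if directory == prefix or directory.startswith(prefix + "/"):
            return file_class
    extension = posixpath.splitext(filename)[1].lower()
    # Falling back to the base class is an assumption of this sketch.
    return EXTENSION_CLASSES.get(extension, "Files")


print(determine_class("/docs/report.doc"))
print(determine_class("/mail/report.doc"))
print(determine_class("/mail/report.doc", explicit_class="Text"))
```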

Hardware Overview 

FIG. 18 is a block diagram that illustrates a computer 
system 1800 upon which an embodiment of the invention 
may be implemented. Computer system 1800 includes a bus 
1802 or other communication mechanism for communicating
information, and a processor 1804 coupled with bus
1802 for processing information. Computer system 1800
also includes a main memory 1806, such as a random access
memory (RAM) or other dynamic storage device, coupled to
bus 1802 for storing information and instructions to be
executed by processor 1804. Main memory 1806 also may
be used for storing temporary variables or other intermediate
information during execution of instructions to be executed 
by processor 1804. Computer system 1800 further includes 
a read only memory (ROM) 1808 or other static storage 
device coupled to bus 1802 for storing static information and 
instructions for processor 1804. A storage device 1810, such 
as a magnetic disk or optical disk, is provided and coupled 
to bus 1802 for storing information and instructions. 

Computer system 1800 may be coupled via bus 1802 to a 
display 1812, such as a cathode ray tube (CRT), for dis- 
playing information to a computer user. An input device 
1814, including alphanumeric and other keys, is coupled to
bus 1802 for communicating information and command
selections to processor 1804. Another type of user input
device is cursor control 1816, such as a mouse, a trackball,
or cursor direction keys for communicating direction information
and command selections to processor 1804 and for
controlling cursor movement on display 1812. This input
device typically has two degrees of freedom in two axes, a
first axis (e.g., x) and a second axis (e.g., y), that allows the
device to specify positions in a plane.

The invention is related to the use of computer system
1800 for implementing the techniques described herein.
According to one embodiment of the invention, those techniques
are implemented by computer system 1800 in
response to processor 1804 executing one or more sequences
of one or more instructions contained in main memory 1806.
Such instructions may be read into main memory 1806 from
another computer-readable medium, such as storage device
1810. Execution of the sequences of instructions contained
in main memory 1806 causes processor 1804 to perform the
process steps described herein. In alternative embodiments,
hard-wired circuitry may be used in place of or in combination
with software instructions to implement the invention.
Thus, embodiments of the invention are not limited to
any specific combination of hardware circuitry and software.

The term "computer-readable medium" as used herein
refers to any medium that participates in providing instructions
to processor 1804 for execution. Such a medium may
take many forms, including but not limited to, non-volatile
media, volatile media, and transmission media. Non-volatile
media includes, for example, optical or magnetic disks, such
as storage device 1810. Volatile media includes dynamic
memory, such as main memory 1806. Transmission media
includes coaxial cables, copper wire and fiber optics, including
the wires that comprise bus 1802. Transmission media
can also take the form of acoustic or light waves, such as
those generated during radio-wave and infra-red data
communications.

Common forms of computer-readable media include, for
example, a floppy disk, a flexible disk, hard disk, magnetic
tape, or any other magnetic medium, a CD-ROM, any other
optical medium, punchcards, papertape, any other physical
medium with patterns of holes, a RAM, a PROM, an
EPROM, a FLASH-EPROM, any other memory chip or
cartridge, a carrier wave as described hereinafter, or any 
other medium from which a computer can read. 

Various forms of computer readable media may be 
involved in carrying one or more sequences of one or more 
instructions to processor 1804 for execution. For example, 
the instructions may initially be carried on a magnetic disk 
of a remote computer. The remote computer can load the 
instructions into its dynamic memory and send the instruc- 
tions over a telephone line using a modem. A modem local 
to computer system 1800 can receive the data on the 
telephone line and use an infra-red transmitter to convert the 
data to an infra-red signal. An infra-red detector can receive
the data carried in the infra-red signal and appropriate 
circuitry can place the data on bus 1802. Bus 1802 carries 
the data to main memory 1806, from which processor 1804 
retrieves and executes the instructions. The instructions 
received by main memory 1806 may optionally be stored on
storage device 1810 either before or after execution by 
processor 1804. 

Computer system 1800 also includes a communication 
interface 1818 coupled to bus 1802. Communication interface
1818 provides a two-way data communication coupling
to a network link 1820 that is connected to a local network 
1822. For example, communication interface 1818 may be 
an integrated services digital network (ISDN) card or a 
modem to provide a data communication connection to a 
corresponding type of telephone line. As another example,
communication interface 1818 may be a local area network 
(LAN) card to provide a data communication connection to 
a compatible LAN. Wireless links may also be implemented. 
In any such implementation, communication interface 1818 
sends and receives electrical, electromagnetic or optical 
signals that carry digital data streams representing various 
types of information. 

Network link 1820 typically provides data communication
through one or more networks to other data devices. For
example, network link 1820 may provide a connection 
through local network 1822 to a host computer 1824 or to 
data equipment operated by an Internet Service Provider 
(ISP) 1826. ISP 1826 in turn provides data communication 
services through the world wide packet data communication 
network now commonly referred to as the "Internet" 1828. 
Local network 1822 and Internet 1828 both use electrical, 
electromagnetic or optical signals that carry digital data 
streams. The signals through the various networks and the 
signals on network link 1820 and through communication 
interface 1818, which carry the digital data to and from 
computer system 1800, are exemplary forms of carrier 
waves transporting the information. 

Computer system 1800 can send messages and receive 
data, including program code, through the network(s), net- 
work link 1820 and communication interface 1818. In the 
Internet example, a server 1830 might transmit a requested 
code for an application program through Internet 1828, ISP 
1826, local network 1822 and communication interface 
1818. In accordance with the invention, one such down- 
loaded application implements the techniques described 
herein. 

The received code may be executed by processor 1804 as 
it is received, and/or stored in storage device 1810, or other 
non-volatile storage for later execution. In this manner,
computer system 1800 may obtain application code in the 
form of a carrier wave. 

In the foregoing specification, the invention has been 
described with reference to specific embodiments thereof. It 
will, however, be evident that various modifications and 
changes may be made thereto without departing from the 
broader spirit and scope of the invention. The specification 
and drawings are, accordingly, to be regarded in an illus- 
trative rather than a restrictive sense. 

What is claimed is: 

1. A method for managing files in a computer system, the 
method comprising the steps of: 

establishing an association between a type of file system
operation, a file, and an interested entity;

wherein the step of establishing an association includes
storing a rule that identifies said type of file system
operation and said interested entity;

detecting when said type of file system operation is
performed on said file; and

in response to detecting that said type of file system
operation is performed on said file, sending a message
to said interested entity.

2. The method of claim 1 wherein said message includes 
data that indicates that said type of file system operation was 
performed on said file. 

3. The method of claim 1 wherein: 

the step of establishing an association includes establish- 
ing an association between a file system operation, a 
directory and an interested entity by storing a rule that 
identifies said type of file system operation and said 
interested entity; 

the step of detecting when said type of file system 
operation is performed on said file includes detecting 
when said type of file system operation is performed on 
said directory; and 

the step of sending a message to said interested entity is 
performed in response to detecting that said type of file 
system operation is performed on said directory. 

4. The method of claim 1 wherein the step of establishing
an association between the type of file system operation, the 
file, and the interested entity is performed in response to the 
file being stored in a particular directory. 

5. The method of claim 4 further comprising the step of 
deleting the association between the type of file system 
operation, the file, and the interested entity in response to the 
file being removed from said particular directory. 

6. The method of claim 1 wherein: 
the file is stored in a database; and 

the method includes the step of performing said type of 
file operation on said file by issuing one or more 
database commands to a database server that manages 
said database. 

7. The method of claim 6 wherein the step of establishing
an association between a type of file system operation, a file, 
and an interested entity includes storing a database record in 
said database that indicates that said interested entity should 
be sent a message when said type of file system operation is 
performed on said file. 

8. A computer-readable medium carrying instructions for 
managing files in a computer system, the instructions com- 
prising instructions for performing the steps of: 

establishing an association between a type of file system
operation, a file, and an interested entity;

wherein the step of establishing an association includes
storing a rule that identifies said type of file system
operation and said interested entity;

detecting when said type of file system operation is
performed on said file; and

in response to detecting that said type of file system
operation is performed on said file, sending a message
to said interested entity.

9. The computer-readable medium of claim 8 wherein 
said message includes data that indicates that said type of file 
system operation was performed on said file. 

10. The computer-readable medium of claim 8 wherein:

the step of establishing an association includes establishing
an association between a file system operation, a
directory and an interested entity by storing a rule that
identifies said type of file system operation and said
interested entity;

the step of detecting when said type of file system
operation is performed on said file includes detecting
when said type of file system operation is performed on
said directory; and

the step of sending a message to said interested entity is
performed in response to detecting that said type of file
system operation is performed on said directory.

11. The computer-readable medium of claim 8 wherein 
the step of establishing an association between the type of 
file system operation, the file, and the interested entity is
performed in response to the file being stored in a particular
directory. 

12. The computer-readable medium of claim 11 further 
comprising the step of deleting the association between the 
type of file system operation, the file, and the interested
entity in response to the file being removed from said
particular directory. 

13. The computer-readable medium of claim 8 wherein: 
the file is stored in a database; and 

the computer-readable medium includes the step of performing
said type of file operation on said file by
issuing one or more database commands to a database 
server that manages said database. 

14. The computer-readable medium of claim 13 wherein 
the step of establishing an association between a type of file
system operation, a file, and an interested entity includes
storing a database record in said database that indicates that
said interested entity should be sent a message when said
type of file system operation is performed on said file.

15. The method of claim 3 wherein the type of file system 
operation is the insertion of another file into said directory.

16. The computer-readable medium of claim 10 wherein 
the type of file system operation is the insertion of another 
file into said directory. 
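
By way of illustration only, the method recited in claims 1, 3, 6, 7 and 15 might be sketched as follows. The sketch is not part of the claims or of any described embodiment. It assumes an in-memory SQLite database standing in for the database server, and the table names, the class NotifyingFileStore, the send callback and the example addresses are all hypothetical names chosen for this example.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration.
SCHEMA = """
CREATE TABLE files (path TEXT PRIMARY KEY, directory TEXT NOT NULL, content BLOB);
CREATE TABLE rules (
    operation TEXT NOT NULL,  -- type of file system operation, e.g. 'insert'
    target    TEXT NOT NULL,  -- file path or directory the rule is tied to
    entity    TEXT NOT NULL   -- interested entity that should receive a message
);
"""


class NotifyingFileStore:
    """Files kept in a database, with rules that trigger messages on operations."""

    def __init__(self, send):
        self.db = sqlite3.connect(":memory:")
        self.db.executescript(SCHEMA)
        self.send = send  # assumed callable taking (entity, message)

    def add_rule(self, operation, target, entity):
        """Establish the association by storing a rule as a database record."""
        self.db.execute(
            "INSERT INTO rules VALUES (?, ?, ?)", (operation, target, entity)
        )

    def _notify(self, operation, path, directory):
        """Find rules tied to the file or its directory and message each entity."""
        rows = self.db.execute(
            "SELECT entity FROM rules WHERE operation = ? AND target IN (?, ?)",
            (operation, path, directory),
        ).fetchall()
        for (entity,) in rows:
            self.send(entity, f"{operation} performed on {path}")

    def insert_file(self, directory, name, content):
        """Perform the operation with a database command, then check the rules."""
        path = f"{directory}/{name}"
        self.db.execute(
            "INSERT INTO files VALUES (?, ?, ?)", (path, directory, content)
        )
        self._notify("insert", path, directory)

    def delete_file(self, path):
        """Delete a stored file; nothing is sent if the file does not exist."""
        row = self.db.execute(
            "SELECT directory FROM files WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            return
        self.db.execute("DELETE FROM files WHERE path = ?", (path,))
        self._notify("delete", path, row[0])


# Usage: a rule on a directory fires when another file is inserted into it.
outbox = []
store = NotifyingFileStore(send=lambda entity, msg: outbox.append((entity, msg)))
store.add_rule("insert", "/reports", "alice@example.com")
store.insert_file("/reports", "q3.txt", b"draft")
store.insert_file("/tmp", "scratch.txt", b"x")  # no rule applies, no message
print(outbox)  # [('alice@example.com', 'insert performed on /reports/q3.txt')]
```

In this sketch the rule lookup runs in the same code path that issues the database command, so detecting the operation and performing it cannot drift apart. An actual implementation could equally place the check in the database server itself.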

* * * * * 